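Vector Identities

$$\mathbf{A} = A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}}, \quad A^2 = A_x^2 + A_y^2 + A_z^2, \quad \mathbf{A}\cdot\mathbf{B} = A_xB_x + A_yB_y + A_zB_z$$

$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} A_y & A_z \\ B_y & B_z \end{vmatrix}\hat{\mathbf{x}} - \begin{vmatrix} A_x & A_z \\ B_x & B_z \end{vmatrix}\hat{\mathbf{y}} + \begin{vmatrix} A_x & A_y \\ B_x & B_y \end{vmatrix}\hat{\mathbf{z}}$$

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix} = C_x\begin{vmatrix} A_y & A_z \\ B_y & B_z \end{vmatrix} - C_y\begin{vmatrix} A_x & A_z \\ B_x & B_z \end{vmatrix} + C_z\begin{vmatrix} A_x & A_y \\ B_x & B_y \end{vmatrix}$$

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\,\mathbf{A}\cdot\mathbf{C} - \mathbf{C}\,\mathbf{A}\cdot\mathbf{B}, \quad \sum_k \varepsilon_{ijk}\varepsilon_{pqk} = \delta_{ip}\delta_{jq} - \delta_{iq}\delta_{jp}$$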


Vector Calculus 


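$$\mathbf{F} = -\nabla V(r) = -\frac{\mathbf{r}}{r}\frac{dV}{dr} = -\hat{\mathbf{r}}\frac{dV}{dr}, \quad \nabla\cdot(\mathbf{r}f(r)) = 3f(r) + r\frac{df}{dr},$$

$$\nabla\cdot(\mathbf{r}\,r^{n-1}) = (n+2)\,r^{n-1}$$

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A})$$

$$\nabla\cdot(S\mathbf{A}) = \nabla S\cdot\mathbf{A} + S\,\nabla\cdot\mathbf{A}, \quad \nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\nabla\times\mathbf{A}) - \mathbf{A}\cdot(\nabla\times\mathbf{B})$$
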
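Curved Orthogonal Coordinates

Cylinder Coordinates

$$q_1 = \rho, \quad q_2 = \varphi, \quad q_3 = z; \quad h_1 = h_\rho = 1, \quad h_2 = h_\varphi = \rho, \quad h_3 = h_z = 1,$$

$$\mathbf{r} = \hat{\mathbf{x}}\rho\cos\varphi + \hat{\mathbf{y}}\rho\sin\varphi + \hat{\mathbf{z}}z$$

Spherical Polar Coordinates

$$q_1 = r, \quad q_2 = \theta, \quad q_3 = \varphi; \quad h_1 = h_r = 1, \quad h_2 = h_\theta = r, \quad h_3 = h_\varphi = r\sin\theta,$$

$$\mathbf{r} = \hat{\mathbf{x}}r\sin\theta\cos\varphi + \hat{\mathbf{y}}r\sin\theta\sin\varphi + \hat{\mathbf{z}}r\cos\theta$$
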
Mathematical Constants

$$e = 2.718281828, \quad \pi = 3.14159265, \quad \ln 10 = 2.302585093,$$

$$1\ \text{rad} = 57.29577951^\circ, \quad 1^\circ = 0.0174532925\ \text{rad},$$

$$\gamma = \lim_{n\to\infty}\left[1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln(n+1)\right] = 0.577215664901532$$
(Euler-Mascheroni number)


$$B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_4 = B_8 = -\frac{1}{30}, \quad B_6 = \frac{1}{42}, \ \ldots$$ (Bernoulli numbers)

This text is designed for the usual introductory physics courses to prepare
undergraduate students for the level of mathematics expected in more advanced
undergraduate physics and engineering courses. One of its goals is to guide
students in learning the mathematical language physicists use by leading
them through worked examples and then having them practice on problems. The
pedagogy is that of introducing concepts, designing and refining methods, and
practicing them repeatedly in physics examples and problems. Geometric and
algebraic approaches and methods are included and are more or less emphasized
in a variety of settings to accommodate different learning styles of students.
Sometimes examples are solved in more than one way. Theorems are usually
derived by sketching the underlying ideas and describing the relevant
mathematical relations so that one can recognize the assumptions they are based
on and their limitations. These proofs are not rigorous in the sense of the
professional mathematician, and no attempt was made to formulate theorems in
their most general form or under the least restrictive assumptions.

An important objective of this text is to train the student to formulate 
physical phenomena in mathematical language, starting from intuitive and 
qualitative ideas. The examples in the text have been worked out so as to 
develop the mathematical treatment along with the physical intuition. A precise 
mathematical formulation of physical phenomena and problems is always the 
ultimate goal. 


Text Overview 


In Chapter 1 the basic concepts of vector algebra and vector analysis are
introduced and applied to classical mechanics and electrodynamics. Chapter 2
deals with the extension of vector algebra and analysis to curved orthogonal
coordinates, again with applications from classical mechanics and
electrodynamics. These chapters lay the foundations for differential equations
in Chapters 8, 9, and 16; variational calculus in Chapter 18; and nonlinear
analysis in Chapter 19. Chapter 3 extends high school algebra of one or two
linear equations to determinants and matrix solutions of general systems of
linear equations, eigenvalues and eigenvectors, and linear transformations in
real and complex vector spaces. These chapters are extended to function spaces
of solutions of differential equations in Chapter 9, thereby laying the
mathematical foundations for the formulation of quantum mechanics. Chapter 4
on group theory is an introduction to the important concept of symmetry in
modern physics. Chapter 5 gives a fairly extensive treatment of series that
form the basis for the special functions discussed in Chapters 10-13 and also
complex functions discussed in Chapters 6 and 7. Chapter 17 on probability
and statistics is basic for the experimentally oriented physicist. Some of its
content can be studied immediately after completion of Chapters 1 and 2, but
later sections are based on Chapters 8 and 10. Chapter 19 on nonlinear methods
can be studied immediately after completion of Chapter 8, and it complements
and extends Chapter 8 in many directions. Chapters 10-13 on special functions
contain many examples of physics problems requiring solutions of differential
equations that can also be incorporated in Chapters 8 and 16. Chapters 14
and 15 on Fourier analysis are indispensable for a more advanced treatment of
partial differential equations in Chapter 16.

Historical remarks are included that detail some physicists and mathematicians
who introduced the ideas and methods that later generations perfected into
the tools we now use routinely. We hope they provide motivation for students
and generate some appreciation of the effort, devotion, and courage of
past and present scientists.


Pathways through the Material 


Because the text contains more than enough material for a two-semester
undergraduate course, the instructor may select topics to suit the particular level
of the class. Chapters 1-3 and 5-8 provide a basis for a one-semester course in 
mathematical physics. By omitting some topics, such as symmetries and group 
theory and tensors, it is possible in a one-semester course to also include parts 
of Chapters 10-13 on special functions, Chapters 14 and 15 on Fourier analysis, 
Chapter 17 on probability and statistics, Chapter 18 on variational calculus, or 
Chapter 19 on nonlinear methods. 

A two-semester course can treat tensors and symmetries in Chapters 2 and 4 
and special functions in Chapters 10-13 more extensively, as well as variational 
calculus in Chapter 18 in support of classical and quantum mechanics. 


Problem-Solving Skills 


Students should study the text until they are sure they understand the
physical interpretation, can derive equations with the book closed, can make
predictions in special cases, and can recognize the limits of applicability of
the theories and equations of physics. However, physics and engineering courses
routinely demand an even higher level of understanding involving active
learning in which students can apply the material to solve problems because it
is common knowledge that we only learn the mathematical language that
physicists use by repeatedly solving problems.

The problem sets at the end of sections and chapters are arranged in the
order in which the material is covered in the text. A sufficient variety and
level of difficulty of problems are provided to ensure that anyone who
conscientiously solves them has mastered the material in the text beyond mere
understanding of step-by-step derivations. More difficult problems that require
some modification of routine methods are also included in various sets to
engage the creative powers of the student; such creativity is expected of the
professional physicist.

Computer Software 

Problems in the text that can be solved analytically can also be solved by
modern symbolic computer software, such as Macsyma, Mathcad, Maple,
Mathematica, and Reduce, because these programs include the routine methods of
mathematical physics texts. Once the student has developed an analytical
result, these powerful programs are useful for checking and plotting the
results. Finding an analytical solution by computer without understanding how
it is derived is pointless. When computers are used too early for solving a
problem, many instructors have found that students can be led astray by the
computers. The available computer software is so diverse as to preclude any
detailed discussion of it. Each instructor willing to make use of computers in
the course will have to choose a particular software package and provide an
introduction for the students. Many problems and examples in the text may then
be adapted to it. However, the real utility and power of these programs lie in
the graphics software they include and in their ability to solve, approximately
and numerically, problems that do not allow for an analytical solution. Special
training is needed, and the text can be used to train students in approximation
methods, such as series and asymptotic expansions, or integral representations
that are suitable for further symbolic computer manipulations.
Chapter 1



Vector Analysis 


1.1 Elementary Approach 


In science and engineering we frequently encounter quantities that have only
magnitude, such as mass, time, and temperature. This magnitude remains the
same no matter how we orient the coordinate axes that we may use. These
quantities we label scalar quantities. In contrast, many interesting physical
quantities have magnitude or length and, in addition, an associated direction.
Quantities with magnitude and direction are called vectors. Their length and
the angle between any two vectors remain unaffected by the orientation of the
coordinates we choose. To distinguish vectors from scalars, we identify vector
quantities with boldface type (e.g., V). Vectors are useful in solving systems
of linear equations (Chapter 3). They are not only helpful in Euclidean
geometry but also indispensable in classical mechanics and engineering because
force, velocity, acceleration, and angular momentum are vectors.
Electrodynamics is unthinkable without vector fields such as electric and
magnetic fields.

Practical problems of mechanics and geometry, such as searching for the
shortest distance between straight lines or parameterizing the orbit of a
particle, will lead us to the differentiation of vectors and to vector
analysis. Vector analysis is a powerful tool for formulating equations of
motion of particles and then solving them in mechanics and engineering, or
field equations of electrodynamics.

In this section, we learn to add and subtract vectors geometrically and 
algebraically in terms of their rectangular components. 

Figure 1.1 Triangle Law of Vector Addition

Figure 1.2 Vector Addition Is Associative

A vector may be geometrically represented by an arrow with length
proportional to the magnitude. The direction of the arrow indicates the
direction of the vector, the positive sense of direction being indicated by the
point. In this representation, vector addition

C = A + B (1.1) 

consists of placing the rear end of vector B at the point of vector A (head to 
tail rule). Vector C is then represented by an arrow drawn from the rear of A 
to the point of B. This procedure, the triangle law of addition, assigns
meaning to Eq. (1.1) and is illustrated in Fig. 1.1. By completing the parallelogram
(sketch it), we see that 


C = A + B = B + A. (1.2) 

In words, vector addition is commutative. 

For the sum of three vectors 

D = A+B + C, 

illustrated in Fig. 1.2, we first add A and B: 

A+B = E. 


Then this sum is added to C: 


D = E + C. 

Alternatively, we may first add B and C: 


B + C = F. 
Figure 1.3 Equilibrium of Forces: $\mathbf{F}_1 + \mathbf{F}_2 = -\mathbf{F}_3$

Then


D = A + F. 

In terms of the original expression, 

(A + B) + C = A+(B + C) 

so that these alternative ways of summing three vectors lead to the same vector,
or vector addition is associative.

A direct physical example of the parallelogram addition law is provided
by a weight suspended by two cords in Fig. 1.3. If the junction point is in
equilibrium, the vector sum of the two forces $\mathbf{F}_1$ and $\mathbf{F}_2$
must cancel the downward force of gravity, $\mathbf{F}_3$. Here, the
parallelogram addition law is subject to immediate experimental verification.¹
Such a balance of forces is of immense importance for the stability of
buildings, bridges, airplanes in flight, etc.

Subtraction is handled by defining the negative of a vector as a vector of 
the same magnitude but with reversed direction. Then 

$$\mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B}).$$
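
The commutative and associative laws and the definition of subtraction can be spot-checked numerically. The following is a minimal sketch, assuming vectors are stored as tuples of Cartesian components (the representation introduced later in this section); the helper names `add`, `negate`, and `subtract` and the sample vectors are illustrative choices, not notation from the text.

```python
def add(u, v):
    """Component-wise sum of two vectors given as equal-length tuples."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def negate(v):
    """Vector of the same magnitude with reversed direction."""
    return tuple(-vi for vi in v)


def subtract(u, v):
    """A - B, defined as A + (-B)."""
    return add(u, negate(v))


# Illustrative vectors (assumed values, chosen only for this check).
A, B, C = (6, 4, 3), (2, -3, -3), (1, 0, 5)

commutative = add(A, B) == add(B, A)                   # Eq. (1.2)
associative = add(add(A, B), C) == add(A, add(B, C))   # (A + B) + C = A + (B + C)
difference = subtract(A, B)

print(commutative, associative, difference)
```

A check on three sample vectors is of course not a proof; it only illustrates the laws that the geometric constructions establish in general.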

The graphical representation of vector A by an arrow suggests using
coordinates as a second possibility. Arrow A (Fig. 1.4), starting from the
origin,² terminates at the point $(A_x, A_y, A_z)$. Thus, if we agree that the
vector is to start at the origin, the positive end may be specified by giving
the rectangular or Cartesian coordinates $(A_x, A_y, A_z)$ of the arrow head.

Figure 1.4 Components and Direction Cosines of A

¹ Strictly speaking, the parallelogram addition was introduced as a definition. Experiments show
that forces are vector quantities that are combined by parallelogram addition, as required by the
equilibrium condition of zero resultant force.

Although A could have represented any vector quantity (momentum, electric
field, etc.), one particularly important vector quantity, the distance from
the origin to the point (x, y, z), is denoted by the special symbol r. We then
have a choice of referring to the displacement as either the vector r or the
collection (x, y, z), the coordinates of its end point:

$$\mathbf{r} \leftrightarrow (x, y, z). \tag{1.3}$$

Defining the magnitude r of vector r as its geometrical length, we see from
Fig. 1.4 that the end point coordinates and the magnitude are related by

$$x = r\cos\alpha, \quad y = r\cos\beta, \quad z = r\cos\gamma. \tag{1.4}$$

$\cos\alpha$, $\cos\beta$, and $\cos\gamma$ are called the direction cosines,
where $\alpha$ is the angle between the given vector and the positive x-axis,
and so on. The (Cartesian) components $A_x$, $A_y$, and $A_z$ can also be
viewed as the projections of A on the respective axes.
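
As a small numerical illustration of Eq. (1.4), the sketch below computes the direction cosines of a vector from its components. It assumes the Pythagorean length of a vector (stated later as Eq. (1.6)); the function name `direction_cosines` and the sample components are hypothetical.

```python
import math


def direction_cosines(v):
    """Return (cos alpha, cos beta, cos gamma) for a nonzero 3-vector v."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in v)


A = (6.0, 4.0, 3.0)  # illustrative components (assumed values)
cos_a, cos_b, cos_g = direction_cosines(A)

# The squares of the direction cosines add up to 1 for any nonzero vector.
sum_of_squares = cos_a**2 + cos_b**2 + cos_g**2
print(cos_a, cos_b, cos_g, sum_of_squares)
```

Inverting Eq. (1.4) in this way also shows that only two of the three direction cosines are independent, because their squares must sum to 1.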

Thus, any vector A may be resolved into its components (or projected
onto the coordinate axes) to yield $A_x = A\cos\alpha$, etc., as in Eq. (1.4).
We refer to the vector as a single quantity A or to its components
$(A_x, A_y, A_z)$. Note that the subscript x in $A_x$ denotes the x component
and not a dependence on the variable x. The choice between using A or its
components $(A_x, A_y, A_z)$ is essentially a choice between a geometric or an
algebraic representation. The geometric “arrow in space” often aids in
visualization. The algebraic set of components is usually more suitable for
precise numerical or algebraic calculations. (This is illustrated in Examples
1.1.1-1.1.3 and also applies to Exercises 1.1.1, 1.1.3, 1.1.5, and 1.1.6.)

² We could start from any point; we choose the origin for simplicity. This freedom of shifting the
origin of the coordinate system without affecting the geometry is called translation invariance.

Vectors enter physics in two distinct forms: 

• Vector A may represent a single force acting at a single point. The force of 
gravity acting at the center of gravity illustrates this form. 

• Vector A may be defined over some extended region; that is, A and its
components may be functions of position: $A_x = A_x(x, y, z)$, and so on.

Imagine a vector A attached to each point (x, y, z), whose length and direction
change with position. Examples include the velocity of air around the wing of 
a plane in flight varying from point to point and electric and magnetic fields 
(made visible by iron filings). Thus, vectors defined at each point of a region 
are usually characterized as a vector field. The concept of the vector defined 
over a region and being a function of position will be extremely important in 
Section 1.2 and in later sections in which we differentiate and integrate vectors. 

A unit vector has length 1 and may point in any direction. Coordinate
unit vectors are implicit in the projection of A onto the coordinate axes to
define its Cartesian components. Now, we define $\hat{\mathbf{x}}$ explicitly as
a vector of unit magnitude pointing in the positive x-direction,
$\hat{\mathbf{y}}$ as a vector of unit magnitude in the positive y-direction,
and $\hat{\mathbf{z}}$ as a vector of unit magnitude in the positive
z-direction. Then $\hat{\mathbf{x}}A_x$ is a vector with magnitude equal to
$A_x$ and in the positive x-direction; that is, the projection of A onto the
x-direction, etc. By vector addition

$$\mathbf{A} = \hat{\mathbf{x}}A_x + \hat{\mathbf{y}}A_y + \hat{\mathbf{z}}A_z, \tag{1.5}$$

which states that a vector equals the vector sum of its components or
projections. Note that if A vanishes, all of its components must vanish
individually; that is, if

$$\mathbf{A} = 0, \quad \text{then} \quad A_x = A_y = A_z = 0.$$

Finally, by the Pythagorean theorem, the length of vector A is

$$A = \left(A_x^2 + A_y^2 + A_z^2\right)^{1/2}. \tag{1.6}$$

This resolution of a vector into its components can be carried out in a variety
of coordinate systems, as demonstrated in Chapter 2. Here, we restrict
ourselves to Cartesian coordinates, where the unit vectors have the coordinates
$\hat{\mathbf{x}} = (1, 0, 0)$, $\hat{\mathbf{y}} = (0, 1, 0)$, and
$\hat{\mathbf{z}} = (0, 0, 1)$.

Equation (1.5) means that the three unit vectors $\hat{\mathbf{x}}$,
$\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ span the real three-dimensional
space: Any constant vector may be written as a linear combination of
$\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$. Since
$\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are linearly
independent (no one is a linear combination of the other two), they form a
basis for the real three-dimensional space.
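
The expansion in Eq. (1.5) and the length formula in Eq. (1.6) can be illustrated with a short sketch. It assumes tuples of Cartesian components; the names `X_HAT`, `Y_HAT`, `Z_HAT`, `add`, `scale`, and `magnitude` and the sample vector are illustrative, not from the text.

```python
import math

# Cartesian unit vectors as component triplets.
X_HAT, Y_HAT, Z_HAT = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def add(u, v):
    """Component-wise vector sum."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def scale(s, v):
    """Multiply the vector v by the scalar s."""
    return tuple(s * vi for vi in v)


def magnitude(v):
    """Length of v by the Pythagorean theorem, Eq. (1.6)."""
    return math.sqrt(sum(vi * vi for vi in v))


A = (6, 4, 3)  # illustrative components (assumed values)

# Eq. (1.5): A = x_hat * A_x + y_hat * A_y + z_hat * A_z
rebuilt = add(add(scale(A[0], X_HAT), scale(A[1], Y_HAT)), scale(A[2], Z_HAT))

print(rebuilt == A)
print(magnitude(A))
```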
Complementary to the geometrical technique, algebraic addition and
subtraction of vectors may now be carried out in terms of their components. For
$\mathbf{A} = \hat{\mathbf{x}}A_x + \hat{\mathbf{y}}A_y + \hat{\mathbf{z}}A_z$ and
$\mathbf{B} = \hat{\mathbf{x}}B_x + \hat{\mathbf{y}}B_y + \hat{\mathbf{z}}B_z$,

$$\mathbf{A} \pm \mathbf{B} = \hat{\mathbf{x}}(A_x \pm B_x) + \hat{\mathbf{y}}(A_y \pm B_y) + \hat{\mathbf{z}}(A_z \pm B_z). \tag{1.7}$$


Biographical Data

Descartes, René. Descartes, a French mathematician and philosopher,
was born in La Haye, France, in 1596 and died in Stockholm, Sweden, in
1650. Cartesius is the latinized version of his name at a time when Latin was
the language of sciences, although he mainly wrote in French. He discovered
his love of mathematics in the army, when he had plenty of time for research.
He introduced the concept of rectangular coordinates, thereby converting
geometry to algebraic equations of the coordinates of points, now called
analytic geometry. Thus, he paved the way for Newton’s and Leibniz’s calculus.
He coined the phrase “Cogito, ergo sum,” which translates to “I think,
therefore I am.”


EXAMPLE 1.1.1

Let

$$\mathbf{A} = 6\hat{\mathbf{x}} + 4\hat{\mathbf{y}} + 3\hat{\mathbf{z}}, \quad \mathbf{B} = 2\hat{\mathbf{x}} - 3\hat{\mathbf{y}} - 3\hat{\mathbf{z}}.$$

Then by Eq. (1.7)

$$\mathbf{A} + \mathbf{B} = (6 + 2)\hat{\mathbf{x}} + (4 - 3)\hat{\mathbf{y}} + (3 - 3)\hat{\mathbf{z}} = 8\hat{\mathbf{x}} + \hat{\mathbf{y}},$$

$$\mathbf{A} - \mathbf{B} = (6 - 2)\hat{\mathbf{x}} + (4 + 3)\hat{\mathbf{y}} + (3 + 3)\hat{\mathbf{z}} = 4\hat{\mathbf{x}} + 7\hat{\mathbf{y}} + 6\hat{\mathbf{z}}. \ ■$$


EXAMPLE 1.1.2

Parallelogram of Forces Find the sum of two forces a and b. To practice
the geometric meaning of vector addition and subtraction, consider two forces

$$\mathbf{a} = (3, 0, 1), \quad \mathbf{b} = (4, 1, 2)$$

(in units of newtons, 1 N = 1 kg m/s², in the International System of Units)
that span a parallelogram with the diagonals forming the sum

$$\mathbf{a} + \mathbf{b} = (3 + 4, 1, 1 + 2) = (7, 1, 3) = \mathbf{b} + \mathbf{a},$$

and the difference

$$\mathbf{b} - \mathbf{a} = (4 - 3, 1, 2 - 1) = (1, 1, 1),$$

as shown in Fig. 1.5. The midpoint c is half the sum,

$$\mathbf{c} = \frac{1}{2}(\mathbf{a} + \mathbf{b}) = \left(\frac{7}{2}, \frac{1}{2}, \frac{3}{2}\right).$$

Alternatively, to obtain the midpoint from a, add half of the second diagonal
that points from a to b; that is,

$$\mathbf{a} + \frac{1}{2}(\mathbf{b} - \mathbf{a}) = \frac{1}{2}(\mathbf{a} + \mathbf{b}) = \mathbf{c} = \left(\frac{7}{2}, \frac{1}{2}, \frac{3}{2}\right).$$
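
The arithmetic of Examples 1.1.1 and 1.1.2 can be reproduced in a few lines. This is a minimal sketch that assumes tuples of components and uses exact fractions for the midpoint; the helper names `add`, `sub`, and `scale` are illustrative.

```python
from fractions import Fraction


def add(u, v):
    """Component-wise vector sum, Eq. (1.7) with the plus sign."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def sub(u, v):
    """Component-wise vector difference, Eq. (1.7) with the minus sign."""
    return tuple(ui - vi for ui, vi in zip(u, v))


def scale(s, v):
    """Multiply the vector v by the scalar s."""
    return tuple(s * vi for vi in v)


# Example 1.1.1
A, B = (6, 4, 3), (2, -3, -3)
print(add(A, B), sub(A, B))

# Example 1.1.2: the midpoint computed in the two ways described in the text.
a, b = (3, 0, 1), (4, 1, 2)
half = Fraction(1, 2)
c = scale(half, add(a, b))               # half of the diagonal a + b
c_alt = add(a, scale(half, sub(b, a)))   # start at a, add half of b - a
print(c == c_alt)
```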


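EXAMPLE 1.1.3


Center of Mass of Three Points at the Corners of a Triangle Consider
each corner of a triangle to have a unit of mass and to be located at
$\mathbf{a}_i$ from the origin, where

$$\mathbf{a}_1 = (2, 0, 0), \quad \mathbf{a}_2 = (4, 1, 1), \quad \mathbf{a}_3 = (3, 3, 2).$$

Then, the center of mass of the triangle is

$$\frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3) = \mathbf{c} = \frac{1}{3}(2 + 4 + 3, 1 + 3, 1 + 2) = \left(3, \frac{4}{3}, 1\right).$$


THEOREM 1.1


If we draw a straight line from each corner to the midpoint of the opposite
side of the triangle in Fig. 1.6, these lines meet in the center, which is at
a distance of two-thirds of the line length from the corner.


The three midpoints are located at the points of the vectors

$$\frac{1}{2}(\mathbf{a}_1 + \mathbf{a}_2) = \frac{1}{2}(2 + 4, 1, 1) = \left(3, \frac{1}{2}, \frac{1}{2}\right),$$

$$\frac{1}{2}(\mathbf{a}_2 + \mathbf{a}_3) = \frac{1}{2}(4 + 3, 1 + 3, 1 + 2) = \left(\frac{7}{2}, 2, \frac{3}{2}\right),$$

$$\frac{1}{2}(\mathbf{a}_3 + \mathbf{a}_1) = \frac{1}{2}(3 + 2, 3, 2) = \left(\frac{5}{2}, \frac{3}{2}, 1\right).$$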
Figure 1.6 Center of a Triangle. The Dashed Line Goes from the Origin to the
Midpoint of a Triangle Side, and the Dotted Lines Go from Each Corner to the
Midpoint of the Opposite Triangle Side

To prove this theorem numerically or symbolically using general vectors, we
start from each corner and end up in the center as follows:

$$(2, 0, 0) + \frac{2}{3}\left[\left(\frac{7}{2}, 2, \frac{3}{2}\right) - (2, 0, 0)\right] = \mathbf{a}_1 + \frac{2}{3}\left[\frac{1}{2}(\mathbf{a}_2 + \mathbf{a}_3) - \mathbf{a}_1\right] = \frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3),$$

$$(4, 1, 1) + \frac{2}{3}\left[\left(\frac{5}{2}, \frac{3}{2}, 1\right) - (4, 1, 1)\right] = \mathbf{a}_2 + \frac{2}{3}\left[\frac{1}{2}(\mathbf{a}_1 + \mathbf{a}_3) - \mathbf{a}_2\right] = \frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3),$$

$$(3, 3, 2) + \frac{2}{3}\left[\left(3, \frac{1}{2}, \frac{1}{2}\right) - (3, 3, 2)\right] = \mathbf{a}_3 + \frac{2}{3}\left[\frac{1}{2}(\mathbf{a}_1 + \mathbf{a}_2) - \mathbf{a}_3\right] = \frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3).$$

This theorem is easy to establish using vector algebra, but it is much more
tedious to prove working only with intersecting straight lines of Euclidean
geometry. Thus, this example is not only useful practice but also lets us
appreciate the power and versatility of the vector algebra. ■
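
The numerical part of this argument can be checked directly. The sketch below, which assumes exact rational arithmetic and illustrative helper names, walks two-thirds of the way along each median of the triangle of Example 1.1.3 and compares the end point with the center of mass.

```python
from fractions import Fraction


def add(u, v):
    """Component-wise vector sum."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def sub(u, v):
    """Component-wise vector difference."""
    return tuple(ui - vi for ui, vi in zip(u, v))


def scale(s, v):
    """Multiply the vector v by the scalar s."""
    return tuple(s * vi for vi in v)


corners = [(2, 0, 0), (4, 1, 1), (3, 3, 2)]  # a_1, a_2, a_3 of Example 1.1.3
centroid = scale(Fraction(1, 3), add(add(corners[0], corners[1]), corners[2]))

points_on_medians = []
for i, corner in enumerate(corners):
    # The two remaining corners define the opposite side.
    p, q = (corners[j] for j in range(3) if j != i)
    midpoint = scale(Fraction(1, 2), add(p, q))
    # Go two-thirds of the way from the corner toward that midpoint.
    points_on_medians.append(
        add(corner, scale(Fraction(2, 3), sub(midpoint, corner)))
    )

print(centroid)
print(all(point == centroid for point in points_on_medians))
```

The symbolic identity in the text holds for arbitrary corner vectors; the code only confirms it for this one triangle.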


EXAMPLE 1.1.4


Elastic Forces A movable point a is held elastically (denoted by springs in
Fig. 1.7) by three fixed points $\mathbf{a}_i$, i = 1, 2, 3; that is, the force
$\mathbf{F}_i = k_i(\mathbf{a}_i - \mathbf{a})$ that a experiences for each i
points toward $\mathbf{a}_i$ and is proportional to the distance. Let us show
that the elastic forces to three points can be replaced by an elastic force to
a single point.

Figure 1.7 The Point a Is Held Elastically by Three Points $\mathbf{a}_i$

This holds because the total force is given by

$$\mathbf{F} = \sum_i \mathbf{F}_i = \sum_i k_i\mathbf{a}_i - \mathbf{a}\sum_i k_i = \left(\sum_i k_i\right)\left(\frac{\sum_i k_i\mathbf{a}_i}{\sum_i k_i} - \mathbf{a}\right) = k_0(\mathbf{a}_0 - \mathbf{a}),$$

where $k_0 = \sum_i k_i$ and $\mathbf{a}_0 = \sum_i k_i\mathbf{a}_i/k_0$. This
shows that the resulting force is equivalent to one with an effective spring
constant $k_0$ acting from a point $\mathbf{a}_0$. Note that if all $k_i$’s are
the same, then $\mathbf{a}_0 = \frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3)$
is the center of mass.

Technical applications include bridges and buildings, for which the balance
of forces is vital for stability. ■
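
The replacement of three springs by a single effective spring can be illustrated numerically. In this sketch the anchor points are those of Example 1.1.3, while the spring constants `ks`, the position `a`, and the function names are assumed purely for illustration.

```python
from fractions import Fraction


def total_force(a, anchors, ks):
    """Sum of the elastic forces F_i = k_i (a_i - a) acting on the point a."""
    return tuple(
        sum(k * (p[d] - a[d]) for k, p in zip(ks, anchors))
        for d in range(len(a))
    )


def equivalent_spring(anchors, ks):
    """Return (k0, a0) with k0 = sum of k_i and a0 = sum of k_i a_i / k0."""
    k0 = sum(ks)
    a0 = tuple(
        sum(Fraction(k) * p[d] for k, p in zip(ks, anchors)) / k0
        for d in range(len(anchors[0]))
    )
    return k0, a0


anchors = [(2, 0, 0), (4, 1, 1), (3, 3, 2)]  # fixed points a_i
ks = (1, 2, 3)                               # hypothetical spring constants
a = (1, 1, 1)                                # hypothetical position of the movable point

k0, a0 = equivalent_spring(anchors, ks)
single = tuple(k0 * (a0[d] - a[d]) for d in range(3))   # k0 (a0 - a)
direct = total_force(a, anchors, ks)                    # sum of the three forces

print(direct, single == direct)
```

With equal spring constants, `equivalent_spring` returns the center of mass of the three anchor points as the effective anchor, as noted in the example.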


Vectors and Vector Space Summary 


An ordered triplet of real numbers $(x_1, x_2, x_3)$ is labeled a vector x. The
number $x_n$ is called the nth component of vector x. The collection of all
such vectors (obeying the properties that follow) forms a three-dimensional
real vector space, or linear space. We ascribe five properties to our vectors:
If $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$,

1. Vector equality: x = y means $x_i = y_i$, i = 1, 2, 3.

2. Vector addition: x + y = z means $x_i + y_i = z_i$, i = 1, 2, 3.

3. Scalar multiplication: $a\mathbf{x} = (ax_1, ax_2, ax_3)$.

4. Negative of a vector: $-\mathbf{x} = (-1)\mathbf{x} = (-x_1, -x_2, -x_3)$.

5. Null vector: There exists a null vector 0 = (0, 0, 0).

Since our vector components are numbers, the following properties also
hold:

1. Addition of vectors is commutative: x + y = y + x.

2. Addition of vectors is associative: (x + y) + z = x + (y + z).

3. Scalar multiplication is distributive:
a(x + y) = ax + ay, also (a + b)x = ax + bx.

4. Scalar multiplication is associative: (ab)x = a(bx).

Furthermore, the null vector 0 is unique, as is the negative of a given vector x.

With regard to the vectors, this approach merely formalizes the component
discussion of Section 1.1. The importance lies in the extensions, which will
be considered later. In Chapter 3, we show that vectors form a linear space,
with the transformations in the linear space described by matrices. Finally,
and perhaps most important, for advanced physics the concept of vectors
presented here generalizes to (i) complex quantities,³ (ii) functions, and
(iii) an infinite number of components. This leads to infinite dimensional
function spaces, the Hilbert spaces, which are important in quantum mechanics.
A brief introduction to function expansions and Hilbert space is provided in
Chapter 9.


SUMMARY

So far, we have defined the operations of addition and subtraction of vectors
guided by the use of elastic and gravitational forces in classical mechanics, set 
up mechanical and geometrical problems such as finding the center of mass of 
a system of mass points, and solved these problems using the tools of vector 
algebra. 
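
The scalar-multiplication properties listed in the vector space summary above can be spot-checked for sample triplets. This is a minimal sketch with assumed sample values and illustrative helper names; it tests particular numbers and does not replace the general argument that the properties follow from those of real numbers.

```python
def add(u, v):
    """Vector addition, component by component."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def scale(s, v):
    """Scalar multiplication: s x = (s x_1, s x_2, s x_3)."""
    return tuple(s * vi for vi in v)


# Illustrative triplets and scalars (assumed values).
x, y = (1, 2, 3), (4, -5, 6)
a, b = 2, -3
NULL = (0, 0, 0)

checks = {
    "distributive over vectors": scale(a, add(x, y)) == add(scale(a, x), scale(a, y)),
    "distributive over scalars": scale(a + b, x) == add(scale(a, x), scale(b, x)),
    "associative": scale(a * b, x) == scale(a, scale(b, x)),
    "negative": add(x, scale(-1, x)) == NULL,
    "null vector": add(x, NULL) == x,
}

for name, ok in checks.items():
    print(name, ok)
```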

Next, we address three varieties of multiplication defined on the basis of 
their applicability in geometry and mechanics: a scalar or inner product in 
Section 1.2; a vector product peculiar to three-dimensional space in Section 
1.3, for which the angular momentum in mechanics is a prime example; and a 
direct or outer product yielding a second-rank tensor in Section 2.7. Division 
by a vector cannot be consistently defined. 

EXERCISES 

1.1.1 A jet plane is flying eastward from Kennedy Airport at a constant speed 
of 500 mph. There is a crosswind from the south at 50 mph. What is the 
resultant speed of the plane relative to the ground? Draw the velocities 
(using graphical software, if available). 

1.1.2 A boat travels straight across a river at a speed of 5 mph when there is
no current. You want to go straight across the river in that boat when
there is a constant current flowing at 1 mph. At what angle do you have
to steer the boat? Plot the velocities.

1.1.3 A sphere of radius a is centered at a point $\mathbf{r}_1$.

(a) Write out the algebraic equation for the sphere. Explain in words why
you chose a particular form. Name theorems from geometry you may
have used.

(b) Write out a vector equation for the sphere. Identify in words what
you are doing.

ANS. (a) $(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = a^2$.

(b) $\mathbf{r} = \mathbf{r}_1 + \mathbf{a}$ (a takes on all directions but has a fixed
magnitude, a).

³ The n-dimensional vector space of real n-tuples is often labeled $\mathbb{R}^n$, and the
n-dimensional vector space of complex n-tuples is labeled $\mathbb{C}^n$.

1.1.4 Show that the medians of a triangle intersect at a point. Show that this 
point is two-thirds of the way from any corner of the triangle to the
midpoint of the opposite side. Compare a geometrical proof with one using
vectors. If you use a Cartesian coordinate system, place your triangle so 
as to simplify the analysis as much as possible. Explain in words why 
you are allowed to do so. 

1.1.5 The velocity of sailboat A relative to sailboat B, $\mathbf{v}_{\rm rel}$, is defined by the
equation $\mathbf{v}_{\rm rel} = \mathbf{v}_A - \mathbf{v}_B$, where $\mathbf{v}_A$ is the velocity of A and
$\mathbf{v}_B$ is the velocity of B. Determine the velocity of A relative to B if

$\mathbf{v}_A$ = 30 km/hr east
$\mathbf{v}_B$ = 40 km/hr north.

Plot the velocities (using graphical software, if available).

ANS. $v_{\rm rel}$ = 50 km/hr, 53.1° south of east.

1.1.6 A sailboat sails for 1 hr at 4 km/hr (relative to the water) on a steady 
compass heading of 40° east of north. The sailboat is simultaneously 
carried along by a current. At the end of the hour the boat is 6.12 km 
from its starting point. The line from its starting point to its location lies 
60° east of north. Find the x (easterly) and y (northerly) components of 
the water’s velocity. Plot all velocities. 

ANS. $v_{\rm east}$ = 2.73 km/hr, $v_{\rm north} \approx 0$ km/hr.

1.1.7 A triangle is defined by the vertices of three vectors, A, B, and C, that 
extend from the origin. In terms of A, B, and C, show that the vector 
sum of the successive sides of the triangle is zero. If software is available, 
plot a typical case. 

1.1.8 Find the diagonal vectors of a unit cube with one corner at the origin and 
three adjacent sides lying along the three axes of a Cartesian coordinate 
system. Show that there are four diagonals with length $\sqrt{3}$. Representing
these as vectors, what are their components? Show that the diagonals of
the cube’s surfaces have length $\sqrt{2}$. Determine their components.

1.1.9 Hubble’s law: Hubble found that distant galaxies are receding with a
velocity proportional to their distance from where we are on Earth
($H_0$ is the Hubble constant). For the ith galaxy

$$\mathbf{v}_i = H_0\mathbf{r}_i,$$

with our Milky Way galaxy at the origin. Show that this recession of the
galaxies from us does not imply that we are at the center of the universe.
Specifically, take the galaxy at $\mathbf{r}_1$ as a new origin and show that Hubble’s
law is still obeyed.


1.2 Scalar or Dot Product 


Having defined vectors, we now proceed to combine them in this section. 
The laws for combining vectors must be mathematically consistent. From the 
possibilities that are consistent we select two that are both mathematically 
and physically interesting. In this section, we start with the scalar product that 
is based on the geometric concept of projection that we used in Section 1.1 
to define the Cartesian components of a vector. Also included here are some 
applications to particle orbits and analytic geometry that will prompt us to 
differentiate vectors, thus starting vector analysis. 

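The projection of a vector A onto a coordinate axis, which defines its
Cartesian components in Eq. (1.5), is a special case of the scalar product
of A and the coordinate unit vectors,

$$A_x = A\cos\alpha = \mathbf{A}\cdot\hat{\mathbf{x}}, \quad A_y = A\cos\beta = \mathbf{A}\cdot\hat{\mathbf{y}}, \quad A_z = A\cos\gamma = \mathbf{A}\cdot\hat{\mathbf{z}} \tag{1.8}$$

and leads us to the general definition of the dot product. Just as the projection
is linear in A, we want the scalar product of two vectors to be linear in A and
B; that is, to obey the distributive and associative laws

$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C} \tag{1.9}$$

$$\mathbf{A}\cdot(y\mathbf{B}) = (y\mathbf{A})\cdot\mathbf{B} = y\,\mathbf{A}\cdot\mathbf{B}, \tag{1.10}$$

where y is a real number. Now we can use the decomposition of B into its
Cartesian components according to Eq. (1.5),
$\mathbf{B} = B_x\hat{\mathbf{x}} + B_y\hat{\mathbf{y}} + B_z\hat{\mathbf{z}}$, to
construct the general scalar or dot product of the vectors A and B from the
special case as

$$\begin{aligned}
\mathbf{A}\cdot\mathbf{B} &= \mathbf{A}\cdot(B_x\hat{\mathbf{x}} + B_y\hat{\mathbf{y}} + B_z\hat{\mathbf{z}}) \\
&= B_x\,\mathbf{A}\cdot\hat{\mathbf{x}} + B_y\,\mathbf{A}\cdot\hat{\mathbf{y}} + B_z\,\mathbf{A}\cdot\hat{\mathbf{z}}, \quad \text{applying Eqs. (1.9) and (1.10)} \\
&= B_xA_x + B_yA_y + B_zA_z, \quad \text{upon substituting Eq. (1.8).}
\end{aligned}$$

Hence,

$$\mathbf{A}\cdot\mathbf{B} = \sum_i A_iB_i = \sum_i B_iA_i = \mathbf{B}\cdot\mathbf{A} \tag{1.11}$$

because we are dealing with components.

If A = B in Eq. (1.11), we recover the magnitude
$A = \left(\sum_i A_i^2\right)^{1/2}$ of A in Eq. (1.6).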
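
The component formula of Eq. (1.11) and the linearity requirements of Eqs. (1.9) and (1.10) can be illustrated as follows. This is a minimal sketch with assumed sample vectors; the function names `dot`, `add`, and `scale` are illustrative.

```python
import math


def dot(u, v):
    """Scalar product as the sum of products of components, Eq. (1.11)."""
    return sum(ui * vi for ui, vi in zip(u, v))


def add(u, v):
    """Component-wise vector sum."""
    return tuple(ui + vi for ui, vi in zip(u, v))


def scale(s, v):
    """Multiply the vector v by the scalar s."""
    return tuple(s * vi for vi in v)


# Illustrative vectors and scalar (assumed values).
A, B, C = (6, 4, 3), (2, -3, -3), (1, 0, 5)
y = 2

symmetric = dot(A, B) == dot(B, A)                                         # Eq. (1.11)
distributive = dot(A, add(B, C)) == dot(A, B) + dot(A, C)                  # Eq. (1.9)
homogeneous = dot(A, scale(y, B)) == dot(scale(y, A), B) == y * dot(A, B)  # Eq. (1.10)
magnitude_A = math.sqrt(dot(A, A))                                         # Eq. (1.6)

print(dot(A, B), symmetric, distributive, homogeneous, magnitude_A)
```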

It is obvious from Eq. (1.11) that the scalar product treats A and B alike,
is symmetric in A and B, or is commutative. Based on this observation, we
can generalize Eq. (1.8) to the projection of A onto an arbitrary vector
$\mathbf{B} \neq 0$ instead of the coordinate unit vectors. As a first step in
this direction, we define $A_B$ as
$A_B = A\cos\theta = \mathbf{A}\cdot\hat{\mathbf{B}}$, where
$\hat{\mathbf{B}} = \mathbf{B}/B$ is the unit vector in the direction of B and
$\theta$ is the angle between A and B as shown in Fig. 1.8. Similarly, we
project B onto A as $B_A = B\cos\theta = \mathbf{B}\cdot\hat{\mathbf{A}}$. These
projections are not symmetric in A and B. To make them symmetric in A and B, we
define

$$\mathbf{A}\cdot\mathbf{B} = A_BB = AB_A = AB\cos\theta. \tag{1.12}$$

Figure 1.8 Scalar Product $\mathbf{A}\cdot\mathbf{B} = AB\cos\theta$

Figure 1.9 The Distributive Law
$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = AB_A + AC_A = A(\mathbf{B} + \mathbf{C})_A$ [Eq. (1.9)]

The distributive law in Eq. (1.9) is illustrated in Fig. 1.9, which states that
the sum of the projections of B and C onto A, $B_A + C_A$, is equal to the
projection of B + C onto A, $(\mathbf{B} + \mathbf{C})_A$.
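
The two projections and the geometric form of Eq. (1.12) can be compared numerically. The sketch below assumes floating-point components and uses `math.isclose` for the comparisons; the name `scalar_projection` and the sample vectors are illustrative.

```python
import math


def dot(u, v):
    """Scalar product from components, Eq. (1.11)."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(v):
    """Length of the vector v."""
    return math.sqrt(dot(v, v))


def scalar_projection(u, onto):
    """Projection of u onto the direction of the nonzero vector `onto`."""
    return dot(u, onto) / norm(onto)


# Illustrative vectors (assumed values).
A, B, C = (3.0, 0.0, 1.0), (4.0, 1.0, 2.0), (1.0, -2.0, 0.5)

A_B = scalar_projection(A, B)   # A cos(theta)
B_A = scalar_projection(B, A)   # B cos(theta)
cos_theta = dot(A, B) / (norm(A) * norm(B))

# Eq. (1.12): A_B * B = A * B_A = A . B, although A_B and B_A differ.
print(A_B, B_A, cos_theta)
print(math.isclose(A_B * norm(B), dot(A, B)), math.isclose(norm(A) * B_A, dot(A, B)))

# Distributive law of Fig. 1.9: B_A + C_A = (B + C)_A.
B_plus_C = tuple(b + c for b, c in zip(B, C))
print(math.isclose(B_A + scalar_projection(C, A), scalar_projection(B_plus_C, A)))
```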

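From Eqs. (1.8), (1.11), and (1.12), we infer that the coordinate unit vectors
satisfy the relations

$$\hat{\mathbf{x}}\cdot\hat{\mathbf{x}} = \hat{\mathbf{y}}\cdot\hat{\mathbf{y}} = \hat{\mathbf{z}}\cdot\hat{\mathbf{z}} = 1, \tag{1.13}$$

whereas

$$\hat{\mathbf{x}}\cdot\hat{\mathbf{y}} = \hat{\mathbf{x}}\cdot\hat{\mathbf{z}} = \hat{\mathbf{y}}\cdot\hat{\mathbf{z}} = 0. \tag{1.14}$$
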
If the component definition of the dot product, Eq. (1.11), is labeled an
algebraic definition, then Eq. (1.12) is a geometric definition. One of the most
common applications of the scalar product in physics is in the definition of
work = force · displacement · $\cos\theta$, where $\theta$ is the angle between the force
and the displacement. This expression is interpreted as displacement times the
projection of the force along the displacement direction; that is, the scalar
product of force and displacement, $W = \mathbf{F}\cdot\mathbf{s}$.

If $\mathbf{A}\cdot\mathbf{B} = 0$ and we know that $\mathbf{A} \neq 0$ and $\mathbf{B} \neq 0$, then from Eq. (1.12) $\cos\theta = 0$
or $\theta = 90°, 270°$, and so on. The vectors $\mathbf{A}$ and $\mathbf{B}$ must be perpendicular.
Alternatively, we may say $\mathbf{A}$ and $\mathbf{B}$ are orthogonal. The unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$
are mutually orthogonal.
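
As a quick numerical illustration, the following minimal Python sketch compares the algebraic definition, Eq. (1.11), with the geometric one, Eq. (1.12). The helper names (`dot`, `norm`, `angle_between`) and the sample vectors, including the force `F` and displacement `s`, are assumptions made up for this illustration and do not come from the text.

```python
import math

def dot(a, b):
    """Algebraic definition, Eq. (1.11): sum of products of components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Magnitude of a vector, obtained from the dot product with itself."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle from the geometric definition, Eq. (1.12): cos(theta) = A.B / (A B)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

# Illustrative vectors (not from the text), chosen so that A.B = 0.
A = (3.0, 4.0, 0.0)
B = (4.0, -3.0, 0.0)
theta_AB = angle_between(A, B)          # orthogonal vectors give pi/2

# Hypothetical force and displacement for the work W = F.s.
F = (2.0, 0.0, 0.0)
s = (3.0, 3.0, 0.0)
work = dot(F, s)                                                   # algebraic form
work_geometric = norm(F) * norm(s) * math.cos(angle_between(F, s)) # geometric form
```

Both forms of the work agree up to rounding, which is the content of Eq. (1.12) for this particular pair of vectors.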


Free Motion and Other Orbits 


EXAMPLE 1.2.1 


Free Particle Motion To apply this notion of orthogonality in two dimensions,
let us first deal with the motion of a particle free of forces along a straight
line

$\mathbf{r}(t) = (x(t), y(t)) = (-3t, 4t)$

through the origin (dashed line in Fig. 1.10). The particle travels with the
velocity $v_x = x/t = -3$ in the x-direction and $v_y = y/t = 4$ in the y-direction (in
meters per second; e.g., 1 m/sec = 3.6 km/hr). The constant velocity $\mathbf{v} = (-3, 4)$
is characteristic of free motion according to Newton’s equations.

Eliminating the time t, we find the homogeneous linear equation $4x + 3y = 0$,
whose coefficient vector (4, 3) we normalize to unit length; that is, we write
the linear equation as

$\frac{4}{5}x + \frac{3}{5}y = 0 = \mathbf{n}\cdot\mathbf{r}$.


Figure 1.10 The Dashed Line Is $\mathbf{n}\cdot\mathbf{r} = 0$ and the Solid Line Is $\mathbf{n}\cdot\mathbf{r} = d$

where $\mathbf{n} = (4/5, 3/5)$ is a constant unit vector and $\mathbf{r}$ is the coordinate vector
varying in the xy-plane; that is, $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}}$. The scalar product

n • r = 0 (1.15) 

means that the projection onto $\mathbf{n}$ of the vector $\mathbf{r}(t)$ pointing from the origin
(a point on the line) to the general point on the line is zero so that $\mathbf{n}$ is the
normal of the straight line. We verify that

$(-3t, 4t)\cdot\frac{(4, 3)}{5} = \frac{t}{5}(-3\cdot 4 + 4\cdot 3) = 0$.

Because the particle’s velocity is a tangent vector of the line, we can also write
the scalar product as $\mathbf{v}\cdot\mathbf{n} = 0$, omitting the normalization factor $t/v = t/5$.

If we throw the particle from the origin in an arbitrary direction with some 
velocity a, it also will travel on a line through the origin. That is, upon varying 
the normal unit vector the linear Eq. (1.15) defines an arbitrary straight line 
through the origin in the xy-plane. Notice that in three dimensions Eq. (1.15) 
describes a plane through the origin, and a hyperplane ((n − 1)-dimensional
subspace) in n-dimensional space.

Now we shift the line by some constant distance d along the normal direction
$\mathbf{n}$ so that it passes through the point (3, 0) on the x-axis, for example.
Because its tangent vector is $\mathbf{v}$, the line is parameterized as $x(t) = 3 - 3t$, $y(t) = 4t$.
We can verify that it passes through the point $\mathbf{r}_2 = (3, 0)$ on the x-axis for
t = 0 and $\mathbf{r}_1 = (0, 4)$ on the y-axis for t = 1. The particle has the same velocity
and the path has the same normal. Eliminating the time as before, we find that
the linear equation for the line becomes $4x + 3y = 12$, or

$\mathbf{n}\cdot\mathbf{r} = d = \frac{12}{5}$. (1.16)

The line no longer goes through the origin (solid line in Fig. 1.10) but has
the shortest distance d = 12/5 from the origin. If $\mathbf{r}_1 = (0, 4)$, $\mathbf{r}_2 = (3, 0)$
are two different points on that line, then $\mathbf{T} = \mathbf{r}_1 - \mathbf{r}_2 = (-3, 4) = \mathbf{v}$ is a
tangent vector of the line and therefore orthogonal to the normal $\mathbf{n}$ because
$\mathbf{n}\cdot\mathbf{T} = \mathbf{n}\cdot\mathbf{r}_1 - \mathbf{n}\cdot\mathbf{r}_2 = d - d = 0$ from Eq. (1.16). Then the general point on
that line is parameterized by

$\mathbf{r}(t) = \mathbf{r}_1 + t\mathbf{T}$ (1.17)

because $\mathbf{n}\cdot\mathbf{r} = \mathbf{n}\cdot\mathbf{r}_1 + t\,\mathbf{n}\cdot\mathbf{T} = d + t\cdot 0 = d$.

Note that in general a straight line is defined by a linear relation, $\mathbf{n}\cdot\mathbf{r} = d$,
and its points depend linearly on one variable t; that is, in two dimensions
Eq. (1.17) represents $x = x_1 + tT_x$, $y = y_1 + tT_y$, with $\mathbf{T} = (T_x, T_y)$. The
geometry of Fig. 1.10 shows that the projection of the vectors $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}$ on the
normal $\mathbf{n}$ is always d; that is, the shortest distance of our line from the origin,
consistent with the algebraic definition $\mathbf{n}\cdot\mathbf{r} = d$ of our line, from which we
started.
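
The statement that every point of the shifted line has the same projection d onto the normal can be checked numerically. The following is only a sketch using the numbers of this example; the function name `point_on_line` and the sample parameter values are assumptions chosen for illustration.

```python
def dot(a, b):
    """Scalar product as a sum over components, Eq. (1.11)."""
    return sum(ai * bi for ai, bi in zip(a, b))

n = (4 / 5, 3 / 5)          # unit normal of the line, from the example
r1 = (0.0, 4.0)             # point on the line (t = 1 in the text's parameterization)
r2 = (3.0, 0.0)             # point on the line (t = 0 in the text's parameterization)
T = tuple(p - q for p, q in zip(r1, r2))   # tangent vector r1 - r2 = (-3, 4)

def point_on_line(t):
    """General point r(t) = r1 + t*T, Eq. (1.17)."""
    return tuple(p + t * c for p, c in zip(r1, T))

d = dot(n, r1)              # distance of the line from the origin, Eq. (1.16)
# Arbitrary sample parameters: the projection onto n should not depend on t.
projections = [dot(n, point_on_line(t)) for t in (-2.0, 0.0, 0.5, 3.0)]
```

Each entry of `projections` equals 12/5 up to rounding, because the tangent vector contributes nothing to the projection onto the normal.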
Equations (1.16) and (1.17) are consistent with the conventional definition
of a straight line by its constant slope (or angle $\alpha$ with the x-axis)

$\tan\alpha = \frac{y - y_1}{x - x_1} \;\Leftrightarrow\; (x - x_1)\sin\alpha - (y - y_1)\cos\alpha = 0$, (1.18)

where the normal $\mathbf{n} = (\sin\alpha, -\cos\alpha)$; upon comparing Eq. (1.18) with Eq.
(1.16), $\mathbf{n}\cdot\mathbf{r} = d = x_1\sin\alpha - y_1\cos\alpha$.

Generalizing to three-dimensional analytic geometry, $\mathbf{n}\cdot\mathbf{r} = d$ is linear in
the variables $(x, y, z) = \mathbf{r}$; that is, it represents a plane, and the unit vector
$\mathbf{n} = (n_1, n_2, n_3)$ is perpendicular to the plane; it is the constant normal of
the plane. If we divide the plane equation by d, the coefficients $n_i/d$ of the
coordinates $x_i$ of the plane give the inverse lengths of the segments from the
origin to the intersection of the Cartesian axes with the plane. For example,
the point of the plane $6x + 3y + 2z = 6$ in Fig. 1.11 on the x-axis defined by
$y = 0 = z$ is $(d/n_1 = 1, 0, 0)$ for $n_1 = 6/7$, $d = 6/7$, noting that $6^2 + 3^2 + 2^2 = 7^2$.
The general point on the plane is parameterized as

$\mathbf{r}(s, t) = \mathbf{r}_1 + s\,\mathbf{l}_1 + t\,\mathbf{l}_2$,

where s and t are parameters, and it is constructed from three of its points
$\mathbf{r}_i$, i = 1, 2, 3, that is, $\mathbf{r}_1 = (1, 0, 0)$, $\mathbf{r}_2 = (0, 2, 0)$, $\mathbf{r}_3 = (0, 0, 3)$ for the plane in
Fig. 1.11, so that the tangent vectors $\mathbf{l}_1 = \mathbf{r}_2 - \mathbf{r}_1$, $\mathbf{l}_2 = \mathbf{r}_3 - \mathbf{r}_1$ of the plane are
not parallel. All this generalizes to higher dimensions.

Geometry also tells us that two nonparallel planes $\mathbf{a}_1\cdot\mathbf{r} = d_1$, $\mathbf{a}_2\cdot\mathbf{r} = d_2$ in
three-dimensional space have a line in common and three nonparallel planes
a single point in general. Finding them amounts to solving linear equations,
which is addressed in Section 3.1 using determinants.


Figure 1.11 The Plane $6x + 3y + 2z = 6$

Figure 1.12 Differentiation of a Vector



More generally, the orbit of a particle or a curve in planar analytic geometry
may be defined as $\mathbf{r}(t) = (x(t), y(t))$, where x and y are functions of the
parameter t. In order to find the slope of a curve or the tangent to an orbit
we need to differentiate vectors. Differentiating a vector function is a simple
extension of differentiating scalar functions if we resolve $\mathbf{r}(t)$ into its Cartesian
components. Then, for differentiation with respect to time, the linear velocity
is given by

$\frac{d\mathbf{r}(t)}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t} = \mathbf{v} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right) = (\dot{x}, \dot{y}, \dot{z})$

because the Cartesian unit vectors are constant. Thus, differentiation of a 
vector always reduces directly to a vector sum of not more than three (for 
three-dimensional space) scalar derivatives. In other coordinate systems (see 
Chapter 2), the situation is more complicated because the unit vectors are 
no longer constant in direction. Differentiation with respect to the space
coordinates is handled in the same way as differentiation with respect to time.
Graphically, we have the slope of a curve, orbit, or trajectory, as shown in 
Fig. 1.12. ■ 


EXAMPLE 1.2.2 


Shortest Distance of a Rocket from an Observer What is the shortest
distance of a rocket traveling at a constant velocity $\mathbf{v} = (1, 2, 3)$ from an
observer at $\mathbf{r}_0 = (2, 1, 3)$? The rocket is launched at time t = 0 at the point
$\mathbf{r}_1 = (1, 1, 1)$.

The path of the rocket is the straight line

$\mathbf{r} = \mathbf{r}_1 + t\mathbf{v}$, (1.19)

or, in Cartesian coordinates,

$x(t) = 1 + t$, $y(t) = 1 + 2t$, $z(t) = 1 + 3t$.
We now minimize the distance $|\mathbf{r} - \mathbf{r}_0|$ of the observer at the point $\mathbf{r}_0 = (2, 1, 3)$
from $\mathbf{r}(t)$, or equivalently $(\mathbf{r} - \mathbf{r}_0)^2 = \min$. Differentiating Eq. (1.19) with
respect to t yields $\dot{\mathbf{r}} = (\dot{x}, \dot{y}, \dot{z}) = \mathbf{v}$. Setting $\frac{d}{dt}(\mathbf{r} - \mathbf{r}_0)^2 = 0$, we obtain the
condition

$2(\mathbf{r} - \mathbf{r}_0)\cdot\dot{\mathbf{r}} = 2[\mathbf{r}_1 - \mathbf{r}_0 + t\mathbf{v}]\cdot\mathbf{v} = 0$.

Because $\dot{\mathbf{r}} = \mathbf{v}$ is the tangent vector of the line, the geometric meaning of this
condition is that the shortest distance vector through $\mathbf{r}_0$ is perpendicular
to the line. Now solving for t yields the ratio of scalar products

$t = -\frac{(\mathbf{r}_1 - \mathbf{r}_0)\cdot\mathbf{v}}{v^2} = -\frac{(-1, 0, -2)\cdot(1, 2, 3)}{(1, 2, 3)\cdot(1, 2, 3)} = \frac{1 + 0 + 6}{1 + 4 + 9} = \frac{1}{2}$.

Substituting this parameter value into Eq. (1.19) gives the point $\mathbf{r}_s = (3/2, 2, 5/2)$
on the line that is closest to $\mathbf{r}_0$. The shortest distance is
$d = |\mathbf{r}_0 - \mathbf{r}_s| = |(1/2, -1, 1/2)| = \sqrt{2/4 + 1} = \sqrt{3/2}$. ■
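
The numbers of Example 1.2.2 can be reproduced with a few lines of Python. This is a minimal sketch that simply evaluates the ratio of scalar products derived above; the helper names `dot` and `sub` are assumptions introduced here for illustration.

```python
import math

def dot(a, b):
    """Scalar product as a sum over components, Eq. (1.11)."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

r1 = (1.0, 1.0, 1.0)   # launch point of the rocket
v = (1.0, 2.0, 3.0)    # constant velocity (tangent vector of the line)
r0 = (2.0, 1.0, 3.0)   # position of the observer

# Condition (r1 - r0 + t v) . v = 0 solved for t.
t_min = -dot(sub(r1, r0), v) / dot(v, v)

# Closest point on the line, Eq. (1.19) evaluated at t_min.
r_s = tuple(p + t_min * c for p, c in zip(r1, v))

# Shortest distance |r0 - r_s|.
diff = sub(r0, r_s)
d = math.sqrt(dot(diff, diff))
```

The vector `diff` is perpendicular to `v`, which is the geometric content of the minimum condition.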

In two dimensions, $\mathbf{r}(t) = (x = a\cos t, y = b\sin t)$ describes an ellipse
with half-axes a, b (so that a = b gives a circle); for example, the orbit of a
planet around the sun in a plane determined by the constant orbital angular
momentum (the normal of the plane). If $\mathbf{r}_0 = (x(t_0), y(t_0)) = (x_0, y_0) = \mathbf{r}(t_0)$
is a point on our orbit, then the tangent at $\mathbf{r}_0$ has the slope $\dot{y}_0/\dot{x}_0$, where the
dots denote the derivatives with respect to the time t as usual. Returning to the
slope formula, imagine inverting $x = x(t)$ to find $t = t(x)$, which is substituted
into $y = y(t) = y(t(x)) = f(x)$ to produce the standard form of a curve in
analytic geometry. Using the chain rule of differentiation, the slope of $f(x)$
at x is

$\frac{df}{dx} = f'(x) = \frac{dy(t(x))}{dx} = \frac{dy}{dt}\frac{dt}{dx} = \frac{\dot{y}}{\dot{x}}$.

The tangent is a straight line and therefore depends linearly on one variable u,

$\mathbf{r} = \mathbf{r}(t_0) + u\,\dot{\mathbf{r}}(t_0)$, (1.20)

whereas the normal through the given point $(x_0, y_0)$ obeys the linear equation

$(x - x_0)\dot{x}_0 + (y - y_0)\dot{y}_0 = 0$. (1.21)

For the elliptical orbit mentioned previously, we check that the point $\mathbf{r}_0 = (0, b)$
for the parameter value $t = \pi/2$ lies on it. The slope at $t = \pi/2$ is
zero, which we also know from geometry and because $\dot{y}_0 = b\cos t|_{t=\pi/2} = 0$,
whereas $\dot{x}_0 = -a\sin t|_{t=\pi/2} = -a \neq 0$. The normal is the y-axis for which Eq.
(1.21) yields $-ax = 0$.

A curve can also be defined implicitly by a functional relation, $F(x, y) = 0$,
of the coordinates. This common case will be addressed in Section 1.5 because
it involves partial derivatives.
Law of Cosines In a similar geometrical approach, we take $\mathbf{C} = \mathbf{A} + \mathbf{B}$ and
dot it into itself:

$\mathbf{C}\cdot\mathbf{C} = (\mathbf{A} + \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = \mathbf{A}\cdot\mathbf{A} + \mathbf{B}\cdot\mathbf{B} + 2\mathbf{A}\cdot\mathbf{B}$. (1.22)

Since

$\mathbf{C}\cdot\mathbf{C} = C^2$, (1.23)

the square of the magnitude of vector $\mathbf{C}$ is a scalar, we see that

$\mathbf{A}\cdot\mathbf{B} = \frac{1}{2}(C^2 - A^2 - B^2)$ (1.24)

is a scalar. Note that since the right-hand side of Eq. (1.24) is a scalar, the
left-hand side $\mathbf{A}\cdot\mathbf{B}$ must also be a scalar, independent of the orientation of the
coordinate system. We defer a proof that a scalar product is invariant under
rotations to Section 2.6.

Equation (1.22) is another form of the law of cosines:

$C^2 = A^2 + B^2 + 2AB\cos\theta$. (1.25)

Comparing Eqs. (1.22) and (1.25), we have another verification of Eq. (1.12) or,
if preferred, a vector derivation of the law of cosines (Fig. 1.13). This law may
also be derived from the triangle formed by the point of $\mathbf{C}$ and its line of shortest
distance from the line along $\mathbf{A}$, which has the length $B\sin\theta$, whereas the
projection of $\mathbf{B}$ onto $\mathbf{A}$ has length $B\cos\theta$. Applying the Pythagorean theorem
to this triangle with a right angle formed by the point of $\mathbf{C}$, $A + \mathbf{B}\cdot\hat{\mathbf{A}}$ and the
shortest distance $B\sin\theta$ gives

$C^2 = (A + \mathbf{B}\cdot\hat{\mathbf{A}})^2 + (B\sin\theta)^2 = A^2 + B^2(\cos^2\theta + \sin^2\theta) + 2AB\cos\theta$.

SUMMARY

In this section, we defined the dot product as an algebraic generalization of 
the geometric concept of projection of vectors (their coordinates). We used 
it for geometrical purposes, such as finding the shortest distance of a point 
from a line or the cosine theorem for triangles. The geometrical meaning of 
the scalar product allowed us to go back and forth between the algebraic 
definition of a straight line as a linear equation and the parameterization of 
its general point $\mathbf{r}(t)$ as a linear function of time and similar steps for planes
in three dimensions. We began differentiation of vectors as a tool for drawing 
tangents to orbits of particles, and this important step represents the start of 
vector analysis enlarging vector algebra. 
The dot product, given by Eq. (1.11), may be generalized in two ways.
The space need not be restricted to three dimensions. In n-dimensional space, 
Eq. (1.11) applies with the sum running from 1 to n; n may be infinity, with the 
sum then a convergent infinite series (see Section 5.2). The other generalization 
extends the concept of vector to embrace functions. The function analog of a 
dot or inner product is discussed in Section 9.4. 

EXERCISES 

1.2.1 A car is moving northward with a constant speed of 50 mph for 5 min, 
and then makes a 45° turn to the east and continues at 55 mph for 1 min. 
What is the average acceleration of the car? 

1.2.2 A particle in an orbit is located at the point r (drawn from the origin) that 
terminates at and specifies the point in space (x, y, z). Find the surface 
swept out by the tip of r and draw it using graphical software if 

(a) (r — a) • a = 0, 

(b) (r — a) • r = 0. 

The vector a is a constant (in magnitude and direction). 

1.2.3 Develop a condition when two forces are parallel, with and without using 
their Cartesian coordinates. 

1.2.4 The Newtonian equations of motion of two particles are

$m_1\dot{\mathbf{v}}_1 = \mathbf{F}_1^i + \mathbf{F}_1^e$, $m_2\dot{\mathbf{v}}_2 = \mathbf{F}_2^i + \mathbf{F}_2^e$,

where $m_i$ are their masses, $\mathbf{v}_i$ are their velocities, and the superscripts
on the forces denote internal and external forces. What is the total force
and the total external force? Write Newton’s third law for the forces.
Define the center of mass and derive its equation of motion. Define the
relative coordinate vector of the particles and derive the relevant equation
of motion. Plot a typical case at some time instant using graphical
software.

Note. The resultant of all forces acting on particle 1, whose origin lies
outside the system, is called external force $\mathbf{F}_1^e$; the force arising from the
interaction of particle 2 with particle 1 is called the internal force $\mathbf{F}_1^i$, so
that $\mathbf{F}_1^i = -\mathbf{F}_2^i$.

1.2.5 If $|\mathbf{A}|$, $|\mathbf{B}|$ are the magnitudes of the vectors $\mathbf{A}$, $\mathbf{B}$, show that
$-|\mathbf{A}||\mathbf{B}| \le \mathbf{A}\cdot\mathbf{B} \le |\mathbf{A}||\mathbf{B}|$.


1.3 Vector or Cross Product 


A second form of vector multiplication employs the sine of the included angle
(denoted by $\theta$) instead of the cosine and is called cross product. The cross
product generates a vector from two vectors, in contrast with the dot product,
which produces a scalar. Applications of the cross product in analytic geometry
and mechanics are also discussed in this section. For instance, the orbital
angular momentum of a particle shown at the point of the distance vector in
Fig. 1.14 is defined as

Angular momentum = radius arm × linear momentum
= distance × linear momentum × $\sin\theta$. (1.26)

Figure 1.14 Angular Momentum

For convenience in treating problems relating to quantities such as angular 
momentum, torque, angular velocity, and area, we define the vector or cross 
product as 

C = A x B, (1.27) 

with the magnitude (but not necessarily the dimensions of length) 

$C = AB\sin\theta$. (1.28)

Unlike the preceding case of the scalar product, C is now a vector, and we 
assign it a direction perpendicular to the plane of A and B such that A, B, and 
C form a right-handed system. If we curl the fingers of the right hand from 
the point of A to B, then the extended thumb will point in the direction of 
A x B, and these three vectors form a right-handed system. With this choice 
of direction, we have 

A x B = —B x A, anticommutation. (1.29) 

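In general, the cross product of two collinear vectors is zero so that

$\hat{\mathbf{x}}\times\hat{\mathbf{x}} = \hat{\mathbf{y}}\times\hat{\mathbf{y}} = \hat{\mathbf{z}}\times\hat{\mathbf{z}} = 0$, (1.30)

whereas

$\hat{\mathbf{x}}\times\hat{\mathbf{y}} = \hat{\mathbf{z}}$, $\hat{\mathbf{y}}\times\hat{\mathbf{z}} = \hat{\mathbf{x}}$, $\hat{\mathbf{z}}\times\hat{\mathbf{x}} = \hat{\mathbf{y}}$,
$\hat{\mathbf{y}}\times\hat{\mathbf{x}} = -\hat{\mathbf{z}}$, $\hat{\mathbf{z}}\times\hat{\mathbf{y}} = -\hat{\mathbf{x}}$, $\hat{\mathbf{x}}\times\hat{\mathbf{z}} = -\hat{\mathbf{y}}$. (1.31)

Among the examples of the cross product in mathematical physics are the
relation between linear momentum $\mathbf{p}$ and angular momentum $\mathbf{L}$ (defining
angular momentum),

$\mathbf{L} = \mathbf{r}\times\mathbf{p}$ (1.32)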
and the relation between linear velocity $\mathbf{v}$ and angular velocity $\boldsymbol{\omega}$,

$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$. (1.33)

Figure 1.15 Parallelogram Representation of the Vector Product

Vectors $\mathbf{v}$ and $\mathbf{p}$ describe properties of the particle or physical system. However,
the position vector $\mathbf{r}$ is determined by the choice of the origin of the
coordinates. This means that $\boldsymbol{\omega}$ and $\mathbf{L}$ depend on the choice of the origin.

The familiar magnetic induction $\mathbf{B}$ occurs in the vector product force
equation called Lorentz force

$\mathbf{F}_M = q\mathbf{v}\times\mathbf{B}$ (SI units), (1.34)

where $\mathbf{v}$ is the velocity of the electric charge q, and $\mathbf{F}_M$ is the resulting magnetic
force on the moving charge. The cross product has an important geometrical
interpretation that we shall use in subsequent sections. In the parallelogram
(Fig. 1.15) defined by $\mathbf{A}$ and $\mathbf{B}$, $B\sin\theta$ is the height if A is taken as the length
of the base. Then $|\mathbf{A}\times\mathbf{B}| = AB\sin\theta$ is the area of the parallelogram. As a
vector, $\mathbf{A}\times\mathbf{B}$ is the area of the parallelogram defined by $\mathbf{A}$ and $\mathbf{B}$, with the area
vector normal to the plane of the parallelogram. This means that area (with its
orientation in space) is treated as a vector.

An alternate definition of the vector product can be derived from the special 
case of the coordinate unit vectors in Eqs. (1.30) and (1.31) in conjunction with 
the linearity of the cross product in both vector arguments, in analogy with 
Eqs. (1.9) and (1.10) for the dot product, 


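$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C}$, (1.35)

$(\mathbf{A} + \mathbf{B})\times\mathbf{C} = \mathbf{A}\times\mathbf{C} + \mathbf{B}\times\mathbf{C}$, (1.36)

$\mathbf{A}\times(y\mathbf{B}) = y\mathbf{A}\times\mathbf{B} = (y\mathbf{A})\times\mathbf{B}$, (1.37)

where y is a number, a scalar. Using the decomposition of $\mathbf{A}$ and $\mathbf{B}$ into their
Cartesian components according to Eq. (1.5), we find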

$\mathbf{A}\times\mathbf{B} \equiv \mathbf{C} = (C_x, C_y, C_z) = (A_x\hat{\mathbf{x}} + A_y\hat{\mathbf{y}} + A_z\hat{\mathbf{z}})\times(B_x\hat{\mathbf{x}} + B_y\hat{\mathbf{y}} + B_z\hat{\mathbf{z}})$

$= (A_xB_y - A_yB_x)\,\hat{\mathbf{x}}\times\hat{\mathbf{y}} + (A_xB_z - A_zB_x)\,\hat{\mathbf{x}}\times\hat{\mathbf{z}} + (A_yB_z - A_zB_y)\,\hat{\mathbf{y}}\times\hat{\mathbf{z}}$,

upon applying Eqs. (1.35) and (1.37) and substituting Eqs. (1.30) and (1.31) so
that the Cartesian components of $\mathbf{A}\times\mathbf{B}$ become

$C_x = A_yB_z - A_zB_y$, $C_y = A_zB_x - A_xB_z$, $C_z = A_xB_y - A_yB_x$, (1.38)

or

$C_i = A_jB_k - A_kB_j$, i, j, k all different, (1.39)

and with cyclic permutation of the indices i, j, and k or $x \to y \to z \to x$ in
Eq. (1.38). The vector product $\mathbf{C}$ may be represented by a determinant$^4$


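$\mathbf{C} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}$, (1.40)

which, according to the expansion Eq. (3.11) of the determinant along the top
row, is a shorthand form of the vector product

$\mathbf{C} = \hat{\mathbf{x}}(A_yB_z - A_zB_y) + \hat{\mathbf{y}}(A_zB_x - A_xB_z) + \hat{\mathbf{z}}(A_xB_y - A_yB_x)$.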

If Eqs. (1.27) and (1.28) are called a geometric definition of the vector product, 
then Eq. (1.38) is an algebraic definition. 

To show the equivalence of Eqs. (1.27) and (1.28) and the component definition
Eq. (1.38), let us form $\mathbf{A}\cdot\mathbf{C}$ and $\mathbf{B}\cdot\mathbf{C}$ using Eq. (1.38). We have

$\mathbf{A}\cdot\mathbf{C} = \mathbf{A}\cdot(\mathbf{A}\times\mathbf{B})$
$= A_x(A_yB_z - A_zB_y) + A_y(A_zB_x - A_xB_z) + A_z(A_xB_y - A_yB_x)$
$= 0$. (1.41)


Similarly, 


B • C = B • (A x B) = 0. (1.42) 

Equations (1.41) and (1.42) show that $\mathbf{C}$ is perpendicular to both $\mathbf{A}$ and $\mathbf{B}$ and
therefore perpendicular to the plane they determine. The positive direction is
determined by considering special cases, such as the unit vectors $\hat{\mathbf{x}}\times\hat{\mathbf{y}} = \hat{\mathbf{z}}$.

4 Determinants are discussed in detail in Section 3.1.
The magnitude is obtained from

$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2$
$= A^2B^2 - A^2B^2\cos^2\theta$
$= A^2B^2\sin^2\theta$, (1.43)

which implies Eq. (1.28). The first step in Eq. (1.43) may be verified by expanding
out in component form using Eq. (1.38) for $\mathbf{A}\times\mathbf{B}$ and Eq. (1.11) for the dot
product. From Eqs. (1.41)-(1.43), we see the equivalence of Eqs. (1.28) and
(1.38), the two definitions of vector product.
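
The component formulas can be spot-checked numerically. The sketch below implements Eq. (1.38) directly and tests Eqs. (1.29), (1.41), (1.42), and the first step of Eq. (1.43) for one pair of vectors; the function names and the sample vectors are assumptions chosen for illustration, and a single numerical case is of course not a proof.

```python
def dot(a, b):
    """Scalar product, Eq. (1.11)."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cartesian components of A x B, Eq. (1.38)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)

# Sample vectors chosen for illustration.
A = (1.0, 2.0, -1.0)
B = (0.0, 1.0, 1.0)

C = cross(A, B)

# Eqs. (1.41) and (1.42): C is perpendicular to both A and B.
a_dot_c = dot(A, C)
b_dot_c = dot(B, C)

# First step of Eq. (1.43): |A x B|^2 = A^2 B^2 - (A.B)^2.
lhs = dot(C, C)
rhs = dot(A, A) * dot(B, B) - dot(A, B) ** 2
```

For these vectors `C` comes out as (3, -1, 1), and `lhs` and `rhs` both equal 11.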


EXAMPLE 1.3.1 


Shortest Distance between Two Rockets in Free Flight Considering 
Example 1.2.2 as a similar but simpler case, we remember that the shortest 
distance between a point and a line is measured along the normal from the 
line through the point. Therefore, we expect that the shortest distance between 
two lines is normal to both tangent vectors of the straight lines. Establishing 
this fact will be our first and most important step. The second step involves the 
projection of a vector between two points, one on each line, onto that normal 
to both lines. However, we also need to locate the points where the normal 
starts and ends. This problem we address first. 

Let us take the first line from Example 1.2.2, namely $\mathbf{r} = \mathbf{r}_1 + t_1\mathbf{v}_1$ with
time variable $t_1$ and tangent vector $\mathbf{v}_1 = \mathbf{r}_2 - \mathbf{r}_1 = (1, 2, 3)$ that goes through
the points $\mathbf{r}_1 = (1, 1, 1)$ and $\mathbf{r}_2 = (2, 3, 4)$ and is shown in Fig. 1.16, along
with the second line $\mathbf{r} = \mathbf{r}_3 + t_2\mathbf{v}_2$ with time variable $t_2$ that goes through
the points $\mathbf{r}_3 = (5, 2, 1)$ and $\mathbf{r}_4 = (4, 1, 2)$, and so has the tangent vector
$\mathbf{r}_4 - \mathbf{r}_3 = (-1, -1, 1) = \mathbf{v}_2$ and the parameterization

$x = 5 - t_2$, $y = 2 - t_2$, $z = 1 + t_2$.

Figure 1.16 Shortest Distance Between Two Straight Lines That Do Not Intersect
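In order to find the end points $\mathbf{r}_{0k}$ of this shortest distance we minimize the
distances squared $(\mathbf{r} - \mathbf{r}_{0k})^2$ to obtain the conditions

$0 = \frac{d}{dt_1}(\mathbf{r} - \mathbf{r}_{02})^2 = \frac{d}{dt_1}(\mathbf{r}_1 - \mathbf{r}_{02} + t_1\mathbf{v}_1)^2 = 2\mathbf{v}_1\cdot(\mathbf{r}_1 - \mathbf{r}_{02} + t_1\mathbf{v}_1)$,

$0 = \frac{d}{dt_2}(\mathbf{r} - \mathbf{r}_{01})^2 = 2\mathbf{v}_2\cdot(\mathbf{r}_3 - \mathbf{r}_{01} + t_2\mathbf{v}_2)$. (1.44)

We can solve for $t_1 = -\mathbf{v}_1\cdot(\mathbf{r}_1 - \mathbf{r}_{02})/v_1^2$ and $t_2 = -\mathbf{v}_2\cdot(\mathbf{r}_3 - \mathbf{r}_{01})/v_2^2$ and
then plug these parameter values into the line coordinates to find the points
$\mathbf{r}_{0k}$ and $d = |\mathbf{r}_{01} - \mathbf{r}_{02}|$. This is straightforward but tedious. Alternatively, we
can exploit the geometric meaning of Eq. (1.44) that the distance vector
$\mathbf{d} = \mathbf{r}_1 + t_1\mathbf{v}_1 - \mathbf{r}_{02} = -(\mathbf{r}_3 + t_2\mathbf{v}_2 - \mathbf{r}_{01})$ is perpendicular to both tangent vectors
$\mathbf{v}_k$ as shown in Fig. 1.16. Thus, the distance vector $\mathbf{d}$ is along the normal unit
vector

$\mathbf{n} = \frac{\mathbf{v}_1\times\mathbf{v}_2}{|\mathbf{v}_1\times\mathbf{v}_2|} = \frac{1}{\sqrt{3}\sqrt{14}}\begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ +1 & +2 & 3 \\ -1 & -1 & 1 \end{vmatrix} = \frac{1}{\sqrt{42}}(5\hat{\mathbf{x}} - 4\hat{\mathbf{y}} + \hat{\mathbf{z}}) = \frac{1}{\sqrt{42}}(5, -4, 1)$,

the cross product of both tangent vectors. We get the distance d by projecting
the distance vector between two points $\mathbf{r}_1$, $\mathbf{r}_3$, one on each line, onto that
normal $\mathbf{n}$; that is, $d = (\mathbf{r}_3 - \mathbf{r}_1)\cdot\mathbf{n} = \frac{1}{\sqrt{42}}(4, 1, 0)\cdot(5, -4, 1) = \frac{20 - 4}{\sqrt{42}} = \frac{16}{\sqrt{42}}$.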

This example generalizes to the shortest distance between two orbits by
examining the shortest distance between their tangent lines. In this form, there
are many applications in mechanics, space travel, and satellite orbits. ■
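
Both routes described in Example 1.3.1 can be compared numerically. The following Python sketch is only an illustration under stated assumptions: it uses the projection onto the common normal for the distance, and it finds the end points by solving the two perpendicularity conditions as a 2 × 2 linear system with Cramer's rule, which is one convenient way (not necessarily the text's) to carry out the "straightforward but tedious" step. All helper names are made up for this sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    """Cartesian components of a x b, Eq. (1.38)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def point(r, v, t):
    """Point r + t*v on a straight line."""
    return tuple(x + t * c for x, c in zip(r, v))

r1, v1 = (1.0, 1.0, 1.0), (1.0, 2.0, 3.0)     # first line
r3, v2 = (5.0, 2.0, 1.0), (-1.0, -1.0, 1.0)   # second line

# Route 1: project r3 - r1 onto the unit normal along v1 x v2.
c = cross(v1, v2)
d = abs(dot(sub(r3, r1), c)) / math.sqrt(dot(c, c))

# Route 2: require (r1 + t1 v1 - r3 - t2 v2) to be perpendicular to v1 and v2,
# then solve the resulting 2x2 linear system for t1, t2 by Cramer's rule.
w = sub(r1, r3)
a11, a12 = dot(v1, v1), -dot(v1, v2)
a21, a22 = dot(v1, v2), -dot(v2, v2)
b1, b2 = -dot(w, v1), -dot(w, v2)
det = a11 * a22 - a12 * a21          # vanishes only for parallel lines
t1 = (b1 * a22 - a12 * b2) / det
t2 = (a11 * b2 - a21 * b1) / det
p1, p2 = point(r1, v1, t1), point(r3, v2, t2)  # end points of the shortest distance
d_endpoints = math.sqrt(dot(sub(p1, p2), sub(p1, p2)))
```

The two distances agree, and the end points correspond to the parameter values t1 = 3/7 and t2 = 5/3. The sketch does not handle parallel lines, for which `det` is zero.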


EXAMPLE 1.3.2 


Medians of a Triangle Meet in the Center Let us consider Example 1.1.3
and Fig. 1.6 again, but now without using the 2:1 ratio of the segments from the
center to the end points of each median. We put the origin of the coordinate
system in one corner of the triangle, as shown in Fig. 1.17, so that the median
from the origin will be given by the vector $\mathbf{m}_3 = (\mathbf{a}_1 + \mathbf{a}_2)/2$. The medians
from the triangle corners $\mathbf{a}_1$ and $\mathbf{a}_2$ intersect at a point we call the center that
is given by the vector $\mathbf{c}$ from the origin. We want to show that $\mathbf{m}_3$ and $\mathbf{c}$ are
parallel (and therefore collinear), indicating that the center will also lie on the
median from the origin.

Figure 1.17 Medians of a Triangle Meet in the Center

From Fig. 1.17, we see that the vector $\mathbf{c} - \mathbf{a}_1$ from the corner $\mathbf{a}_1$ to the center
will be parallel to $\frac{1}{2}\mathbf{a}_2 - \mathbf{a}_1$; similarly, $\mathbf{c} - \mathbf{a}_2$ will be collinear with $\frac{1}{2}\mathbf{a}_1 - \mathbf{a}_2$. We
write these conditions as follows:

$(\mathbf{c} - \mathbf{a}_1)\times\left(\frac{1}{2}\mathbf{a}_2 - \mathbf{a}_1\right) = 0$, $(\mathbf{c} - \mathbf{a}_2)\times\left(\frac{1}{2}\mathbf{a}_1 - \mathbf{a}_2\right) = 0$.

Expanding, and using the fact that $\mathbf{a}_1\times\mathbf{a}_1 = 0 = \mathbf{a}_2\times\mathbf{a}_2$, we find

$\mathbf{c}\times\frac{1}{2}\mathbf{a}_2 - \mathbf{c}\times\mathbf{a}_1 - \frac{1}{2}(\mathbf{a}_1\times\mathbf{a}_2) = 0$, $\mathbf{c}\times\frac{1}{2}\mathbf{a}_1 - \mathbf{c}\times\mathbf{a}_2 - \frac{1}{2}(\mathbf{a}_2\times\mathbf{a}_1) = 0$.

Adding these equations, the last terms on the left-hand sides cancel, and the
other terms combine to yield

$-\frac{1}{2}\mathbf{c}\times(\mathbf{a}_1 + \mathbf{a}_2) = 0$,

proving that $\mathbf{c}$ and $\mathbf{m}_3$ are parallel.

The center of mass (see Example 1.1.3) will be at the point $\frac{1}{3}(\mathbf{a}_1 + \mathbf{a}_2)$ and
is therefore on the median from the origin. By symmetry it must be on the
other medians as well, confirming both that they meet at a point and that the
distance from the triangle corner to the intersection is two-thirds of the total
length of the median. ■

SUMMARY

If we define a vector as an ordered triplet of numbers (or functions) as in 
Section 1.2, then there is no problem identifying the cross product as a vector. 
The cross product operation maps the two triples A and B into a third triple 
C, which by definition is a vector. In Section 2.6, we shall see that the cross 
product also transforms like a vector. 

The cross product combines two vectors antisymmetrically and involves 
the sine of the angle between the vectors, in contrast to their symmetric 
combination in the scalar product involving the cosine of their angle, and 
it unifies the angular momentum and velocity of mechanics with the area concept
of geometry. The vector nature of the cross product is peculiar to three-dimensional
space, but it can naturally be generalized to higher dimensions.
The cross product occurs in many applications such as conditions for parallel 
forces or other vectors and the shortest distance between lines or curves more 
generally. 

We now have two ways of multiplying vectors; a third form is discussed 
in Chapter 2. However, what about division by a vector? The ratio B/A is not 
uniquely specified (see Exercise 3.2.21) unless A and B are also required to be 
parallel. Hence, division of one vector by another is meaningless. 
EXERCISES

1.3.1 Prove the law of cosines starting from $A^2 = (\mathbf{B} - \mathbf{C})^2$, where $\mathbf{A}$, $\mathbf{B}$,
and $\mathbf{C}$ are the vectors collinear with the sides of a triangle. Plot the
triangle and describe the theorem in words. State the analog of the law
of cosines on the unit sphere (Fig. 1.18), if $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ go from the
origin to the corners of the triangle.

1.3.2 A coin with a mass of 2 g rolls on a horizontal plane at a constant 
velocity of 5 cm/sec. What is its kinetic energy? 

Hint. Show that the radius of the coin drops out. 

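1.3.3 Starting with $\mathbf{C} = \mathbf{A} + \mathbf{B}$, show that $\mathbf{C}\times\mathbf{C} = 0$ leads to

$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}$.

1.3.4 Show that

(a) $(\mathbf{A} - \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = A^2 - B^2$,

(b) $(\mathbf{A} - \mathbf{B})\times(\mathbf{A} + \mathbf{B}) = 2\mathbf{A}\times\mathbf{B}$.

The distributive laws needed here,

$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C}$

and

$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C}$,

may be verified by expansion in Cartesian components.

1.3.5 If $\mathbf{P} = \hat{\mathbf{x}}P_x + \hat{\mathbf{y}}P_y$ and $\mathbf{Q} = \hat{\mathbf{x}}Q_x + \hat{\mathbf{y}}Q_y$ are any two nonparallel (also
non-antiparallel) vectors in the xy-plane, show that $\mathbf{P}\times\mathbf{Q}$ is in the
z-direction.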
Figure 1.19 Law of Sines

1.3.6 Prove that $(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2$. Write the identity
appropriately and describe it in geometric language. Make a plot for a
typical case using graphical software.

1.3.7 Using the vectors

$\mathbf{P} = \hat{\mathbf{x}}\cos\theta + \hat{\mathbf{y}}\sin\theta$,

$\mathbf{Q} = \hat{\mathbf{x}}\cos\varphi - \hat{\mathbf{y}}\sin\varphi$,

$\mathbf{R} = \hat{\mathbf{x}}\cos\varphi + \hat{\mathbf{y}}\sin\varphi$,

prove the familiar trigonometric identities

$\sin(\theta + \varphi) = \sin\theta\cos\varphi + \cos\theta\sin\varphi$,
$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi$.

1.3.8 If four vectors a, b, c, and d all lie in the same plane, show that 

(a x b) x (c x d) = 0. 

If graphical software is available, plot all vectors for a specific numerical 
case. 

Hint. Consider the directions of the cross product vectors. 

1.3.9 Derive the law of sines (Fig. 1.19):

$\frac{\sin\alpha}{|\mathbf{A}|} = \frac{\sin\beta}{|\mathbf{B}|} = \frac{\sin\gamma}{|\mathbf{C}|}$.

1.3.10 A proton of mass m, charge +e, and (asymptotic) momentum $\mathbf{p} = m\mathbf{v}$
is incident on a nucleus of charge +Ze at an impact parameter b.
Determine its distance of closest approach.
Hint. Consider only the Coulomb repulsion and classical mechanics,
not the strong interaction and quantum mechanics.

1.3.11 Expand a vector x in components parallel to three linearly independent 
vectors a, b, c. 

ANS. (a x b • c)x = (x x b • c)a + (a x x • c)b + (a x b ■ x)c. 

1.3.12 Let F be a force vector drawn from the coordinate vector r. If r' goes 
from the origin to another point on the line through the point of r with 
tangent vector given by the force, show that the torque r' x F = r x F— 
that is, the torque about the origin due to the force stays the same. 

1.3.13 A car drives in a horizontal circular track of radius R (to its center of 
mass). Find the speed at which it will overturn, if h is the height of its 
center of mass and d the distance between its left and right wheels. 
Hint. Find the speed at which there is no vertical force on the inner 
wheels. (The mass of the car drops out.) 

1.3.14 A force F = (3, 2, 4) acts at the point (1, 4, 2). Find the torque about 
the origin. Plot the vectors using graphical software. 

1.3.15 Generalize the cross product to n-dimensional space (n = 2, 4, 5, ...)
and give a geometrical interpretation of your construction. Give realistic
examples in four- and higher dimensional spaces.

1.3.16 A jet plane flies due south over the north pole with a constant speed 
of 500 mph. Determine the angle between a plumb line hanging freely 
in the plane and the radius vector from the center of the earth to the 
plane above the north pole. 

Hint. Assume that the earth’s angular velocity is $2\pi$ radians in 24 hr,
which is a good approximation. Why?


1.4 Triple Scalar Product and Triple Vector Product 


Sections 1.2 and 1.3 discussed the two types of vector multiplication. However,
there are combinations of three vectors, $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$ and $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$, that
occur with sufficient frequency in mechanics, electrodynamics, and analytic
geometry to deserve further attention. The combination

$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$ (1.45)

is known as the triple scalar product. $\mathbf{B}\times\mathbf{C}$ yields a vector that, dotted into
$\mathbf{A}$, gives a scalar. We note that $(\mathbf{A}\cdot\mathbf{B})\times\mathbf{C}$ represents a scalar crossed into
a vector, an operation that is not defined. Hence, if we agree to exclude this
undefined interpretation, the parentheses may be omitted and the triple scalar
product written as $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$.


Triple Scalar Product 
Figure 1.20 Parallelepiped Representation of Triple Scalar Product

Using Eq. (1.38) for the cross product and Eq. (1.11) for the dot product,
we obtain

$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = A_x(B_yC_z - B_zC_y) + A_y(B_zC_x - B_xC_z) + A_z(B_xC_y - B_yC_x)$
$= \mathbf{B}\cdot\mathbf{C}\times\mathbf{A} = \mathbf{C}\cdot\mathbf{A}\times\mathbf{B} = -\mathbf{A}\cdot\mathbf{C}\times\mathbf{B}$
$= -\mathbf{C}\cdot\mathbf{B}\times\mathbf{A} = -\mathbf{B}\cdot\mathbf{A}\times\mathbf{C}$. (1.46)

The high degree of symmetry present in the component expansion should be
noted. Every term contains the factors $A_i$, $B_j$, and $C_k$. If i, j, and k are in cyclic
order (x, y, z), the sign is positive. If the order is anticyclic, the sign is negative.
Furthermore, the dot and the cross may be interchanged:

$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \mathbf{A}\times\mathbf{B}\cdot\mathbf{C}$. (1.47)


A convenient representation of the component expansion of Eq. (1.46) is provided
by the determinant

$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}$, (1.48)

which follows from Eq. (1.38) by dotting $\mathbf{B}\times\mathbf{C}$ into $\mathbf{A}$. The rules for interchanging
rows and columns of a determinant$^5$ provide an immediate verification of
the permutations listed in Eq. (1.46), whereas the symmetry of $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ in
the determinant form suggests the relation given in Eq. (1.46). The triple products
discussed in Section 1.3, which showed that $\mathbf{A}\times\mathbf{B}$ was perpendicular to
both $\mathbf{A}$ and $\mathbf{B}$, were special cases of the general result [Eq. (1.46)].

The triple scalar product has a direct geometrical interpretation in
which the three vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are interpreted as defining a
parallelepiped (Fig. 1.20):

$|\mathbf{B}\times\mathbf{C}| = BC\sin\theta$ = area of parallelogram base. (1.49)

The direction, of course, is normal to the base. Dotting $\mathbf{A}$ into this means
multiplying the base area by the projection of $\mathbf{A}$ onto the normal, or base
times height. Therefore,

$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$ = volume of parallelepiped defined by $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. (1.50)

Note that $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$ may sometimes be negative. This is not a problem, and its
proper interpretation is provided in Chapter 2.

5 See Section 3.1 for a detailed discussion of the properties of determinants.


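EXAMPLE 1.4.1


A Parallelepiped For

\[
\mathbf{A} = \hat{\mathbf{x}} + 2\hat{\mathbf{y}} - \hat{\mathbf{z}}, \qquad
\mathbf{B} = \hat{\mathbf{y}} + \hat{\mathbf{z}}, \qquad
\mathbf{C} = \hat{\mathbf{x}} - \hat{\mathbf{y}},
\]

\[
\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} =
\begin{vmatrix}
1 & 2 & -1 \\
0 & 1 & 1 \\
1 & -1 & 0
\end{vmatrix}
= 4.
\]

This is the volume of the parallelepiped defined by $\mathbf{A}$, $\mathbf{B}$,
and $\mathbf{C}$. ■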
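
The determinant of Example 1.4.1 is easy to check by machine. The following is a minimal Python sketch, not part of the original text; the helper names `cross`, `dot`, and `triple_scalar` are chosen here for illustration. It evaluates the triple scalar product from its definition and confirms the sign behavior of Eq. (1.46) under cyclic and anticyclic permutations.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def triple_scalar(a, b, c):
    """A . (B x C), i.e. the determinant of Eq. (1.48)."""
    return dot(a, cross(b, c))


# Vectors of Example 1.4.1
A = (1, 2, -1)
B = (0, 1, 1)
C = (1, -1, 0)

print(triple_scalar(A, B, C))   # 4, the volume of the parallelepiped
print(triple_scalar(B, C, A))   # 4, cyclic permutation
print(triple_scalar(A, C, B))   # -4, anticyclic permutation
```

Integer components are used so that the comparison with the hand calculation is exact.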


Recall that we already encountered a triple scalar product, namely the
distance $d \sim (\mathbf{r}_2 - \mathbf{r}_1)\cdot(\mathbf{v}_1\times\mathbf{v}_2)$
between two straight lines in Example 1.3.1.


Triple Vector Product 


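The second triple product of interest is $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$,
which is a vector. Here, the parentheses must be retained, as is seen from a
special case $(\hat{\mathbf{x}}\times\hat{\mathbf{x}})\times\hat{\mathbf{y}} = 0$,
whereas $\hat{\mathbf{x}}\times(\hat{\mathbf{x}}\times\hat{\mathbf{y}}) =
\hat{\mathbf{x}}\times\hat{\mathbf{z}} = -\hat{\mathbf{y}}$. Let us start with an
example that illustrates a key property of the triple product.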


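EXAMPLE 1.4.2


A Triple Vector Product By using the three vectors given in Example 1.4.1,
we obtain

\[
\mathbf{B}\times\mathbf{C} =
\begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
0 & 1 & 1 \\
1 & -1 & 0
\end{vmatrix}
= \hat{\mathbf{x}} + \hat{\mathbf{y}} - \hat{\mathbf{z}}
\]

and

\[
\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) =
\begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
1 & 2 & -1 \\
1 & 1 & -1
\end{vmatrix}
= -\hat{\mathbf{x}} - \hat{\mathbf{z}}
= -(\hat{\mathbf{y}} + \hat{\mathbf{z}}) - (\hat{\mathbf{x}} - \hat{\mathbf{y}}).
\]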


By rewriting the result in the last line as a linear combination of $\mathbf{B}$
and $\mathbf{C}$, we notice that, taking a geometric approach, the triple
product vector is perpendicular to $\mathbf{A}$ and to $\mathbf{B}\times\mathbf{C}$.
The plane spanned by $\mathbf{B}$ and $\mathbf{C}$ is perpendicular to
$\mathbf{B}\times\mathbf{C}$, so the triple product lies in this plane (Fig. 1.21):

\[
\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = u\mathbf{B} + v\mathbf{C},
\tag{1.51}
\]

where $u$ and $v$ are numbers. Taking the scalar product of Eq. (1.51) with
$\mathbf{A}$ gives zero for the left-hand side so that
$u\,\mathbf{A}\cdot\mathbf{B} + v\,\mathbf{A}\cdot\mathbf{C} = 0$. Hence,
$u = w\,\mathbf{A}\cdot\mathbf{C}$ and $v = -w\,\mathbf{A}\cdot\mathbf{B}$ for
a suitable number $w$. Substituting these values into Eq. (1.51) gives

\[
\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) =
w\,[\mathbf{B}\,(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}\,(\mathbf{A}\cdot\mathbf{B})].
\tag{1.52}
\]

Figure 1.21 B and C are in the xy-plane. B x C is perpendicular to the
xy-plane and is shown here along the z-axis. Then A x (B x C) is
perpendicular to the z-axis and therefore is back in the xy-plane.

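Equation (1.52), with $w = 1$, which we now prove, is known as the BAC-CAB
rule. Since Eq. (1.52) is linear in $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$,
$w$ is independent of these magnitudes. That is, we only need to show that
$w = 1$ for unit vectors $\hat{\mathbf{A}}$, $\hat{\mathbf{B}}$, $\hat{\mathbf{C}}$.
Let us denote $\hat{\mathbf{B}}\cdot\hat{\mathbf{C}} = \cos\alpha$,
$\hat{\mathbf{C}}\cdot\hat{\mathbf{A}} = \cos\beta$,
$\hat{\mathbf{A}}\cdot\hat{\mathbf{B}} = \cos\gamma$, and square Eq. (1.52) to obtain

\[
\begin{aligned}
[\hat{\mathbf{A}}\times(\hat{\mathbf{B}}\times\hat{\mathbf{C}})]^2
 &= \hat{\mathbf{A}}^2(\hat{\mathbf{B}}\times\hat{\mathbf{C}})^2
  - [\hat{\mathbf{A}}\cdot(\hat{\mathbf{B}}\times\hat{\mathbf{C}})]^2 \\
 &= 1 - \cos^2\alpha - [\hat{\mathbf{A}}\cdot(\hat{\mathbf{B}}\times\hat{\mathbf{C}})]^2 \\
 &= w^2[(\hat{\mathbf{A}}\cdot\hat{\mathbf{C}})^2 + (\hat{\mathbf{A}}\cdot\hat{\mathbf{B}})^2
  - 2\,\hat{\mathbf{A}}\cdot\hat{\mathbf{B}}\;\hat{\mathbf{A}}\cdot\hat{\mathbf{C}}\;
       \hat{\mathbf{B}}\cdot\hat{\mathbf{C}}] \\
 &= w^2(\cos^2\beta + \cos^2\gamma - 2\cos\alpha\cos\beta\cos\gamma),
\end{aligned}
\tag{1.53}
\]

using $(\mathbf{A}\times\mathbf{B})^2 = \mathbf{A}^2\mathbf{B}^2 - (\mathbf{A}\cdot\mathbf{B})^2$
repeatedly. Consequently, the (squared) volume spanned by $\hat{\mathbf{A}}$,
$\hat{\mathbf{B}}$, $\hat{\mathbf{C}}$ that occurs in Eq. (1.53) can be written as

\[
[\hat{\mathbf{A}}\cdot(\hat{\mathbf{B}}\times\hat{\mathbf{C}})]^2
 = 1 - \cos^2\alpha - w^2(\cos^2\beta + \cos^2\gamma - 2\cos\alpha\cos\beta\cos\gamma).
\]

Here, we must have $w^2 = 1$ because this volume is symmetric in $\alpha$,
$\beta$, $\gamma$. That is, $w = \pm 1$ and is independent of $\hat{\mathbf{A}}$,
$\hat{\mathbf{B}}$, $\hat{\mathbf{C}}$. Again using the special case
$\hat{\mathbf{x}}\times(\hat{\mathbf{x}}\times\hat{\mathbf{y}}) = -\hat{\mathbf{y}}$
in Eq. (1.52) finally gives $w = 1$.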
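
As a numerical cross-check of the BAC-CAB rule, the sketch below (Python, standard library only; the helper names and the seeded random sampling are assumptions made for this illustration, not part of the text) evaluates both sides of Eq. (1.52) with w = 1, first for the vectors of Example 1.4.2 and then for a batch of random vectors.

```python
import random


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def bac_cab(a, b, c):
    """Right-hand side of Eq. (1.52) with w = 1: B (A.C) - C (A.B)."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(ac * bi - ab * ci for bi, ci in zip(b, c))


# Vectors of Examples 1.4.1 and 1.4.2
A, B, C = (1, 2, -1), (0, 1, 1), (1, -1, 0)
lhs = cross(A, cross(B, C))
rhs = bac_cab(A, B, C)
print(lhs, rhs)   # (-1, 0, -1) (-1, 0, -1), i.e. -x - z as in Example 1.4.2

# Random test: the two sides should agree to rounding error
rng = random.Random(0)


def random_vector():
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))


max_diff = 0.0
for _ in range(1000):
    a, b, c = random_vector(), random_vector(), random_vector()
    left = cross(a, cross(b, c))
    right = bac_cab(a, b, c)
    max_diff = max(max_diff, max(abs(l - r) for l, r in zip(left, right)))
print(max_diff)   # of the order of floating-point rounding
```

A random test of this kind does not replace the proof, but it would expose a wrong sign or a swapped factor immediately.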
An alternate and easier algebraic derivation using the Levi-Civita symbol
$\varepsilon_{ijk}$ of Chapter 2 is the topic of Exercise 2.9.8.

Note that because vectors are independent of the coordinates, a vector
equation is independent of the particular coordinate system. The coordinate
system only determines the components. If the vector equation can be
established in Cartesian coordinates, it is established and valid in any of the
coordinate systems, as will be shown in Chapter 2. Thus, Eq. (1.52) may be
verified by a direct though not very elegant method of expanding into Cartesian
components (see Exercise 1.4.1).

Other, more complicated, products may be simplified by using these forms
of the triple scalar and triple vector products.


SUMMARY

We have developed the geometric meaning of the triple scalar product as a
volume spanned by three vectors and exhibited its component form that is
directly related to a determinant whose entries are the Cartesian components
of the vectors.

The main property of the triple vector product is its decomposition
expressed in the BAC-CAB rule. It plays a role in electrodynamics, a vector field
theory in which cross products abound.

EXERCISES 

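1.4.1 Verify the expansion of the triple vector product

\[
\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) =
\mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})
\]

by direct expansion in Cartesian coordinates.

1.4.2 Show that the first step in Eq. (1.43),

\[
(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2,
\]

is consistent with the BAC-CAB rule for a triple vector product.

1.4.3 The orbital angular momentum $\mathbf{L}$ of a particle is given by
$\mathbf{L} = \mathbf{r}\times\mathbf{p} = m\mathbf{r}\times\mathbf{v}$, where
$\mathbf{p}$ is the linear momentum. With linear and angular velocity related by
$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$, show that

\[
\mathbf{L} = mr^2[\boldsymbol{\omega} - \hat{\mathbf{r}}(\hat{\mathbf{r}}\cdot\boldsymbol{\omega})],
\]

where $\hat{\mathbf{r}}$ is a unit vector in the $\mathbf{r}$ direction. For
$\mathbf{r}\cdot\boldsymbol{\omega} = 0$, this reduces to
$\mathbf{L} = I\boldsymbol{\omega}$, with the moment of inertia $I$ given by $mr^2$.

1.4.4 The kinetic energy of a single particle is given by $T = \frac{1}{2}mv^2$.
For rotational motion this becomes $\frac{1}{2}m(\boldsymbol{\omega}\times\mathbf{r})^2$.
Show that

\[
T = \frac{1}{2}m[r^2\omega^2 - (\mathbf{r}\cdot\boldsymbol{\omega})^2].
\]

For $\mathbf{r}\cdot\boldsymbol{\omega} = 0$, this reduces to
$T = \frac{1}{2}I\omega^2$, with the moment of inertia $I$ given by $mr^2$.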
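1.4.5 Show that

\[
\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) + \mathbf{b}\times(\mathbf{c}\times\mathbf{a})
 + \mathbf{c}\times(\mathbf{a}\times\mathbf{b}) = 0.^{6}
\]

1.4.6 A vector $\mathbf{A}$ is decomposed into a radial vector $\mathbf{A}_r$ and
a tangential vector $\mathbf{A}_t$. If $\hat{\mathbf{r}}$ is a unit vector in the
radial direction, show that

(a) $\mathbf{A}_r = \hat{\mathbf{r}}(\mathbf{A}\cdot\hat{\mathbf{r}})$ and

(b) $\mathbf{A}_t = -\hat{\mathbf{r}}\times(\hat{\mathbf{r}}\times\mathbf{A})$.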

1.4.7 Prove that a necessary and sufficient condition for the three
(nonvanishing) vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ to be coplanar
is the vanishing of the triple scalar product

\[
\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = 0.
\]


1.4.8 Vector $\mathbf{D}$ is a linear combination of three noncoplanar (and
nonorthogonal) vectors:

\[
\mathbf{D} = a\mathbf{A} + b\mathbf{B} + c\mathbf{C}.
\]

Show that the coefficients are given by a ratio of triple scalar products,

\[
a = \frac{\mathbf{D}\cdot\mathbf{B}\times\mathbf{C}}{\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}},
\]

and so on.

If symbolic software is available, evaluate numerically the triple scalar
products and coefficients for a typical case.

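1.4.9 Show that

\[
(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) =
(\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C}).
\]

1.4.10 Show that

\[
(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) =
(\mathbf{A}\cdot\mathbf{B}\times\mathbf{D})\,\mathbf{C} - (\mathbf{A}\cdot\mathbf{B}\times\mathbf{C})\,\mathbf{D}.
\]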


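1.4.11 Given

\[
\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad
\mathbf{b}' = \frac{\mathbf{c}\times\mathbf{a}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad
\mathbf{c}' = \frac{\mathbf{a}\times\mathbf{b}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}},
\]

and $\mathbf{a}\cdot\mathbf{b}\times\mathbf{c} \neq 0$, show that

(a) $\mathbf{x}'\cdot\mathbf{y} = 0$ (if $\mathbf{x} \neq \mathbf{y}$) and
$\mathbf{x}'\cdot\mathbf{y} = 1$ (if $\mathbf{x} = \mathbf{y}$), for
$(\mathbf{x}, \mathbf{y} = \mathbf{a}, \mathbf{b}, \mathbf{c})$,

(b) $\mathbf{a}'\cdot\mathbf{b}'\times\mathbf{c}' = (\mathbf{a}\cdot\mathbf{b}\times\mathbf{c})^{-1}$,

(c) $\mathbf{a} = \dfrac{\mathbf{b}'\times\mathbf{c}'}{\mathbf{a}'\cdot\mathbf{b}'\times\mathbf{c}'}$.

1.4.12 If $\mathbf{x}'\cdot\mathbf{y} = 0$ if $\mathbf{x} \neq \mathbf{y}$ and
$\mathbf{x}'\cdot\mathbf{y} = 1$ if $\mathbf{x} = \mathbf{y}$, for
$(\mathbf{x}, \mathbf{y} = \mathbf{a}, \mathbf{b}, \mathbf{c})$, prove that

\[
\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}.
\]

(This is the converse of Problem 1.4.11.)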


6 This is Jacobi's identity for vector products.

1.4.13 Show that any vector $\mathbf{V}$ may be expressed in terms of the
reciprocal vectors $\mathbf{a}'$, $\mathbf{b}'$, $\mathbf{c}'$ (of Problem 1.4.11) by

\[
\mathbf{V} = (\mathbf{V}\cdot\mathbf{a})\,\mathbf{a}' + (\mathbf{V}\cdot\mathbf{b})\,\mathbf{b}'
 + (\mathbf{V}\cdot\mathbf{c})\,\mathbf{c}'.
\]


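1.4.14 An electric charge $q_1$ moving with velocity $\mathbf{v}_1$ produces a
magnetic induction $\mathbf{B}$ given by

\[
\mathbf{B} = \frac{\mu_0}{4\pi}\, q_1\, \frac{\mathbf{v}_1\times\hat{\mathbf{r}}}{r^2}
\quad \text{(SI units)},
\]

where $\hat{\mathbf{r}}$ points from $q_1$ to the point at which $\mathbf{B}$ is
measured (Biot and Savart's law).

(a) Show that the magnetic force on a second charge $q_2$, velocity
$\mathbf{v}_2$, is given by the triple vector product

\[
\mathbf{F}_2 = \frac{\mu_0}{4\pi}\, \frac{q_1 q_2}{r^2}\,
\mathbf{v}_2\times(\mathbf{v}_1\times\hat{\mathbf{r}}).
\]

(b) Write out the corresponding magnetic force $\mathbf{F}_1$ that $q_2$ exerts
on $q_1$. Define your unit radial vector. How do $\mathbf{F}_1$ and
$\mathbf{F}_2$ compare?

(c) Calculate $\mathbf{F}_1$ and $\mathbf{F}_2$ for the case of $q_1$ and $q_2$
moving along parallel trajectories side by side.

ANS.

(b) $\mathbf{F}_1 = -\dfrac{\mu_0}{4\pi}\, \dfrac{q_1 q_2}{r^2}\,
\mathbf{v}_1\times(\mathbf{v}_2\times\hat{\mathbf{r}})$.

(c) $\mathbf{F}_1 = \dfrac{\mu_0}{4\pi}\, \dfrac{q_1 q_2}{r^2}\, v^2\,\hat{\mathbf{r}}
 = -\mathbf{F}_2$.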
1.5 Gradient, ∇


Partial Derivatives


In this section, we deal with derivatives of functions of several variables that
will lead us to the concept of directional derivative or gradient operator, which
is of central importance in mechanics, electrodynamics, and engineering.

We can view a function $z = \varphi(x, y)$ of two variables geometrically as a
surface over the $xy$-plane in three-dimensional Euclidean space because for
each point $(x, y)$ we find the $z$ value from $\varphi$. For a fixed value $y$
then, $z = \varphi(x, y) \equiv f(x)$ is a function of $x$ only, viz. a curve on
the intersection of the surface with the $xz$-plane going through $y$. The slope
of this curve,

\[
\frac{df}{dx} \equiv \frac{\partial\varphi(x, y)}{\partial x}
 = \lim_{h\to 0} \frac{\varphi(x + h, y) - \varphi(x, y)}{h},
\tag{1.54}
\]

is the partial derivative of $\varphi$ with respect to $x$ defined with the
understanding that the other variable $y$ is held fixed. It is useful for
drawing tangents and locating a maximum or minimum on the curve where the slope
is zero. The partial derivative $\partial\varphi/\partial y$ is similarly defined
holding $x$ fixed (i.e., it is the slope of the surface in the $y$-direction),
and so on for the higher partial derivatives.
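
The limit in Eq. (1.54) can be imitated on a computer by choosing a small but finite h. The following Python sketch is only an illustration: the sample function `phi` and the step size are arbitrary choices made here, not taken from the text.

```python
def partial_x(phi, x, y, h=1e-6):
    """Difference quotient of Eq. (1.54): y is held fixed, x is shifted by h."""
    return (phi(x + h, y) - phi(x, y)) / h


def partial_y(phi, x, y, h=1e-6):
    """Same quotient with the roles of x and y exchanged."""
    return (phi(x, y + h) - phi(x, y)) / h


def phi(x, y):
    """Hypothetical sample surface z = x^2 y + y^3."""
    return x**2 * y + y**3


# Exact partial derivatives at (1, 2): 2xy = 4 and x^2 + 3y^2 = 13
print(partial_x(phi, 1.0, 2.0))
print(partial_y(phi, 1.0, 2.0))
```

The printed values agree with 4 and 13 up to an error of order h, which is the price of stopping the limit at a finite step.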
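EXAMPLE 1.5.1


Error Estimate Error estimates usually involve many partial derivatives.
Let us calculate the moment of inertia of a rectangular slab of metal of length
$a = 10 \pm 1$ cm, width $b = 15 \pm 2$ cm, and height $c = 5 \pm 1$ cm about an
axis through its center of gravity and perpendicular to the area $ab$ and
estimate the error. The uniform density is $\rho = 5 \pm 0.1$ g/cm$^3$. The
moment of inertia is given by

\[
\begin{aligned}
I &= \rho\int (x^2 + y^2)\,d\tau
 = \rho c\left(\int_{-a/2}^{a/2} x^2\,dx \int_{-b/2}^{b/2} dy
 + \int_{-a/2}^{a/2} dx \int_{-b/2}^{b/2} y^2\,dy\right) \\
 &= \frac{2\rho c}{3}\left[b\left(\frac{a}{2}\right)^3 + a\left(\frac{b}{2}\right)^3\right]
 = \frac{\rho abc}{12}\,(a^2 + b^2) \\
 &= \frac{1}{2}\,5^6\,(4 + 9)\ \mathrm{g\,cm^2} = 10.15625\times 10^{-3}\ \mathrm{kg\,m^2},
\end{aligned}
\tag{1.55}
\]

where $d\tau = c\,dx\,dy$ has been used.

The corresponding error in $I$ derives from the errors in all variables, each
being weighted by the corresponding partial derivative,

\[
(\Delta I)^2 = \left(\frac{\partial I}{\partial\rho}\right)^2(\Delta\rho)^2
 + \left(\frac{\partial I}{\partial a}\right)^2(\Delta a)^2
 + \left(\frac{\partial I}{\partial b}\right)^2(\Delta b)^2
 + \left(\frac{\partial I}{\partial c}\right)^2(\Delta c)^2,
\]

where $\Delta x$ is the error in the variable $x$, that is, $\Delta a = 1$ cm,
etc. The partial derivatives

\[
\begin{aligned}
\frac{\partial I}{\partial\rho} &= \frac{abc}{12}\,(a^2 + b^2), &
\frac{\partial I}{\partial a} &= \frac{\rho bc}{12}\,(3a^2 + b^2), \\
\frac{\partial I}{\partial b} &= \frac{\rho ac}{12}\,(a^2 + 3b^2), &
\frac{\partial I}{\partial c} &= \frac{\rho ab}{12}\,(a^2 + b^2)
\end{aligned}
\tag{1.56}
\]

are obtained from Eq. (1.55). Substituting the numerical values of all
parameters, we get

\[
\begin{aligned}
\frac{\partial I}{\partial\rho}\,\Delta\rho &= 0.203125\times 10^{-3}\ \mathrm{kg\,m^2}, &
\frac{\partial I}{\partial a}\,\Delta a &= 1.640625\times 10^{-3}\ \mathrm{kg\,m^2}, \\
\frac{\partial I}{\partial b}\,\Delta b &= 3.2291667\times 10^{-3}\ \mathrm{kg\,m^2}, &
\frac{\partial I}{\partial c}\,\Delta c &= 2.03125\times 10^{-3}\ \mathrm{kg\,m^2}.
\end{aligned}
\]

Squaring and summing up, we find $\Delta I = 4.1577\times 10^{-3}$ kg m$^2$. This
error of more than 40% of the value $I$ is much higher than the largest error
$\Delta c \sim 20\%$ of the variables on which $I$ depends and shows how errors
in several variables can add up. Thus, all decimals except the first one can be
dropped safely. ■


EXAMPLE 1.5.2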


Partials of a Plane Let us now take a plane $F(\mathbf{r}) = \mathbf{n}\cdot\mathbf{r} - d = 0$
that cuts the coordinate axes at $x = 1$, $y = 2$, $z = 3$ so that $n_x = d$,
$2n_y = d$, $3n_z = d$. Because the normal $\mathbf{n}^2 = 1$, we have the
constraint $d^2(1 + \frac{1}{4} + \frac{1}{9}) = 1$ so that $d = 6/7$. Hence, the
partial derivatives

\[
\frac{\partial F}{\partial x} = n_x = 6/7, \qquad
\frac{\partial F}{\partial y} = n_y = 3/7, \qquad
\frac{\partial F}{\partial z} = n_z = 2/7
\]

are the components of a vector $\mathbf{n}$ (the normal) for our plane
$6x + 3y + 2z = 6$. This suggests the partial derivatives of any function $F$
are a vector. ■
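
Returning briefly to the error estimate of Example 1.5.1, its numbers are easy to reproduce. The Python sketch below is an illustration only: the variable names and the unit-conversion constant are introduced here, and the partial derivatives are those of Eq. (1.56), combined in quadrature.

```python
import math


def inertia(rho, a, b, c):
    """Moment of inertia of Eq. (1.55): rho*a*b*c*(a^2 + b^2)/12."""
    return rho * a * b * c * (a**2 + b**2) / 12.0


# Central values and errors in cgs units (g, cm), as in Example 1.5.1
rho, a, b, c = 5.0, 10.0, 15.0, 5.0
d_rho, d_a, d_b, d_c = 0.1, 1.0, 2.0, 1.0

# Partial derivatives of Eq. (1.56), each weighted by the error of its variable
contributions = [
    a * b * c * (a**2 + b**2) / 12.0 * d_rho,
    rho * b * c * (3 * a**2 + b**2) / 12.0 * d_a,
    rho * a * c * (a**2 + 3 * b**2) / 12.0 * d_b,
    rho * a * b * (a**2 + b**2) / 12.0 * d_c,
]

G_CM2_TO_KG_M2 = 1.0e-7   # 1 g cm^2 = 1e-3 kg * 1e-4 m^2
I_si = inertia(rho, a, b, c) * G_CM2_TO_KG_M2
delta_I_si = math.sqrt(sum(t * t for t in contributions)) * G_CM2_TO_KG_M2

print(I_si, delta_I_si, delta_I_si / I_si)
```

The relative error comes out slightly above 0.4, in line with the statement that the combined error exceeds 40% of I.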


To provide more motivation for the vector nature of the partial derivatives, we
now introduce the total variation of a function $F(x, y)$,

\[
dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy.
\tag{1.57}
\]

It consists of independent variations in the $x$- and $y$-directions. We write
$dF$ as a sum of two increments, one purely in the $x$- and the other in the
$y$-direction,

\[
\begin{aligned}
dF(x, y) &\equiv F(x + dx, y + dy) - F(x, y) \\
 &= [F(x + dx, y + dy) - F(x, y + dy)] + [F(x, y + dy) - F(x, y)] \\
 &= \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy,
\end{aligned}
\]

by adding and subtracting $F(x, y + dy)$. The mean value theorem (i.e.,
continuity of $F$) tells us that here $\partial F/\partial x$,
$\partial F/\partial y$ are evaluated at some point $\xi$, $\eta$ between $x$ and
$x + dx$, $y$ and $y + dy$, respectively. As $dx \to 0$ and $dy \to 0$,
$\xi \to x$ and $\eta \to y$. This result generalizes to three and higher
dimensions. For example, for a function $\varphi$ of three variables,

\[
\begin{aligned}
d\varphi(x, y, z) &\equiv [\varphi(x + dx, y + dy, z + dz) - \varphi(x, y + dy, z + dz)] \\
 &\quad + [\varphi(x, y + dy, z + dz) - \varphi(x, y, z + dz)] \\
 &\quad + [\varphi(x, y, z + dz) - \varphi(x, y, z)] \\
 &= \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy
  + \frac{\partial\varphi}{\partial z}\,dz.
\end{aligned}
\tag{1.58}
\]

Note that if $F$ is a scalar function, $dF$ is also a scalar and the form of
Eq. (1.57) suggests an interpretation as a scalar product of the coordinate
displacement vector $d\mathbf{r} = (dx, dy)$ with the partial derivatives of
$F$; the same holds for $d\varphi$ in three dimensions. These observations pave
the way for the gradient in the next section.

As an application of the total variation, we consider the slope of an
implicitly defined curve $F(x, y) = 0$, a general theorem that we postponed in
Section 1.3. Because also $dF = 0$ on the curve, we find the slope of the curve

\[
\frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y}
\tag{1.59}
\]

from Eq. (1.57). Compare this result with $\dot{y}/\dot{x}$ for the slope of a
curve defined in terms of two functions $x(t)$, $y(t)$ of time $t$ in Section 1.2.

Often, we are confronted with more difficult problems of finding a slope
given some constraint. A case in point is the next example.
EXAMPLE 1.5.3


Extremum under a Constraint Find the points of shortest (or longest)
distance from the origin on the curve $G(x, y) \equiv x^2 + xy + y^2 - 1 = 0$.

From analytic geometry we know that the points on such a quadratic form
with center at the origin (there are no terms linear in $x$ or $y$ that would
shift the center) represent a conic section. But which kind? To answer this
question, note that $(x + y)^2 = x^2 + 2xy + y^2 \ge 0$ implies that
$xy \ge -(x^2 + y^2)/2$ so that the quadratic form is positive definite, that
is, $G(x, y) + 1 \ge 0$, and $G$ must therefore be an ellipse (Fig. 1.22). Hence,
our problem is equivalent to finding the orientation of its principal axes (see
Section 3.5 for the alternative matrix diagonalization method). The square of
the distance from the origin is defined by the function $F(x, y) = x^2 + y^2$,
subject to the constraint that the point $(x, y)$ lie on the ellipse defined by
$G$. The constraint $G$ defines $y = y(x)$. Therefore, we look for the
solutions of


\[
\frac{dF(x, y(x))}{dx} = 2x + 2y\,\frac{dy}{dx} = 0.
\]

Differentiating $G$, we find

\[
y' = -\frac{2x + y}{2y + x} \quad \text{from} \quad 2x + y + xy' + 2yy' = 0,
\]

which we substitute into our min/max condition $dF/dx = 0$. This yields
$x(2y + x) = y(2x + y)$, or $y = \pm x$.

Figure 1.22 The ellipse $x^2 + xy + y^2 = 1$.

Substituting $x = y$ into $G$ gives the solutions $x = \pm 1/\sqrt{3}$, while
$x = -y$ yields the points $x = \pm 1$ on the ellipse. Substituting $x = 1$ into
$G$ gives $y = 0$ and $y = -1$, while $x = -1$ yields $y = 0$ and $y = 1$.
Although the points $(x, y) = (1, 0), (-1, 0)$ lie on the ellipse, their
distance $(= 1)$ from the origin is neither shortest nor longest. However, the
points $(1, -1)$, $(-1, 1)$ have the longest distance $(= \sqrt{2})$ and define
the line $x + y = 0$ through the origin (at 135°) as a principal axis. The points
$(1/\sqrt{3}, 1/\sqrt{3})$, $(-1/\sqrt{3}, -1/\sqrt{3})$ define the line at 45°
through the origin as the second principal axis that is orthogonal to the first
axis.

It is also instructive to apply the slope formula (1.59) at the intersection
points of the principal axes and the ellipse, that is,
$(1/\sqrt{3}, 1/\sqrt{3})$, $(1, -1)$. The partial derivatives there are given by
$G_x \equiv \partial G/\partial x = 2x + y = 3/\sqrt{3} = \sqrt{3}$ and
$2 - 1 = 1$, respectively, $G_y \equiv \partial G/\partial y = 2y + x =
3/\sqrt{3} = \sqrt{3}$ and $-2 + 1 = -1$, so that the slopes become
$-G_x/G_y = -\sqrt{3}/\sqrt{3} = -1$ equal to that of the principal axis
$x + y = 0$, and $-1/(-1) = 1$ equal to that of the other principal axis
$x - y = 0$. ■
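
The result of Example 1.5.3 can be confirmed by brute force. In the sketch below (Python; the polar parametrization and the grid of 3600 angles are choices made for this illustration, not the method of the text) the ellipse is written as x = r cos θ, y = r sin θ, so that the constraint G = 0 gives r²(1 + sin θ cos θ) = 1, and the extreme distances are found by scanning θ.

```python
import math


def radius_on_ellipse(theta):
    """Distance from the origin to the ellipse x^2 + xy + y^2 = 1 along angle theta."""
    return 1.0 / math.sqrt(1.0 + math.sin(theta) * math.cos(theta))


N = 3600   # grid spacing of 0.1 degree
angles = [2.0 * math.pi * k / N for k in range(N)]

r_max, theta_max = max((radius_on_ellipse(t), t) for t in angles)
r_min, theta_min = min((radius_on_ellipse(t), t) for t in angles)

# Directions are only defined modulo 180 degrees (each axis meets the ellipse twice)
print(r_max, math.degrees(theta_max) % 180.0)   # about sqrt(2) along 135 degrees
print(r_min, math.degrees(theta_min) % 180.0)   # about sqrt(2/3) along 45 degrees
```

The scan reproduces the longest distance √2 along x + y = 0 and the shortest distance √(2/3), the distance of the point (1/√3, 1/√3), along x − y = 0.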

Although this problem was straightforward to solve, there is the more
elegant Lagrange multiplier method for finding a maximum or minimum of a
function $F(x, y)$ subject to a constraint $G(x, y) = 0$.

Introducing a Lagrange multiplier $\lambda$ helps us avoid the direct (and often
messy algebraic) solution for $x$ and $y$ as follows. Because we look for the
solution of

\[
dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = 0, \qquad
dG = \frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial y}\,dy = 0,
\tag{1.60}
\]

we can solve for the slope $dy/dx$ from one equation and substitute that
solution into the other one. Equivalently, we use the function $F + \lambda G$
of three variables $x$, $y$, $\lambda$, and solve

\[
d(F + \lambda G) =
\left(\frac{\partial F}{\partial x} + \lambda\frac{\partial G}{\partial x}\right)dx
 + \left(\frac{\partial F}{\partial y} + \lambda\frac{\partial G}{\partial y}\right)dy
 + \frac{\partial(F + \lambda G)}{\partial\lambda}\,d\lambda = 0
\]

by choosing $\lambda$ to satisfy
$\frac{\partial F}{\partial y} + \lambda\frac{\partial G}{\partial y} = 0$, for
example, and then eliminating the last term by the constraint $G = 0$ (note
that $F$ does not depend on $\lambda$) so that
$\frac{\partial F}{\partial x} + \lambda\frac{\partial G}{\partial x} = 0$
follows. Including the constraint, we now have three equations for three
unknowns $x$, $y$, $\lambda$, where the slope $\lambda$ is not usually needed.


EXAMPLE 1.5.4


Lagrange Multiplier Method Let us illustrate the method by solving
Example 1.5.3 again, this time using the Lagrange multiplier method. The $x$
and $y$ partial derivative equations of the Lagrange multiplier method are given
by

\[
\begin{aligned}
\frac{\partial F}{\partial x} + \lambda\frac{\partial G}{\partial x} &= 2x + \lambda(2x + y) = 0, \\
\frac{\partial F}{\partial y} + \lambda\frac{\partial G}{\partial y} &= 2y + \lambda(2y + x) = 0.
\end{aligned}
\]

We find for the ratio $\xi \equiv y/x = -2(\lambda + 1)/\lambda$ and
$\xi = -\lambda/2(1 + \lambda)$, that is, $\xi = 1/\xi$, or $\xi = \pm 1$, so that
the principal axes are along the lines $x + y = 0$ and $x - y = 0$ through the
origin. Substituting these solutions into the conic section $G$ yields
$x = 1/\sqrt{3} = y$ and $x = 1 = -y$, respectively. Contrast this simple, yet
sophisticated approach with our previous lengthy solution. ■
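
The three Lagrange conditions of Example 1.5.4 can be tested point by point. The sketch below is an illustration in Python; the function names and the way λ is extracted from the x-equation are assumptions made here. It evaluates the residuals of the two derivative equations and of the constraint at the four extremal points, and at the point (1, 0), which lies on the ellipse but is not an extremum.

```python
import math


def grad_F(x, y):
    """Gradient of the squared distance F = x^2 + y^2."""
    return (2.0 * x, 2.0 * y)


def grad_G(x, y):
    """Gradient of the constraint G = x^2 + xy + y^2 - 1."""
    return (2.0 * x + y, 2.0 * y + x)


def G(x, y):
    return x * x + x * y + y * y - 1.0


def multiplier(x, y):
    """Solve the x-equation 2x + lambda (2x + y) = 0 for lambda."""
    return -grad_F(x, y)[0] / grad_G(x, y)[0]


def residuals(x, y):
    """Left-hand sides of the x-equation, the y-equation, and the constraint."""
    lam = multiplier(x, y)
    fx, fy = grad_F(x, y)
    gx, gy = grad_G(x, y)
    return (fx + lam * gx, fy + lam * gy, G(x, y))


s = 1.0 / math.sqrt(3.0)
candidates = [(s, s), (-s, -s), (1.0, -1.0), (-1.0, 1.0)]
for point in candidates:
    print(point, multiplier(*point), residuals(*point))

# (1, 0) satisfies G = 0 but violates the y-equation, so it is not an extremum
print(residuals(1.0, 0.0))
```

For the points on x − y = 0 the multiplier is −2/3, and for those on x + y = 0 it is −2; in both cases all three residuals vanish to rounding error.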


Biographical Data 

Lagrange, Joseph Louis comte de. Lagrange, a French mathematician
and physicist, was born in Torino to a wealthy French-Italian family in 1736
and died in Paris in 1813. While in school, an essay on calculus by the English
astronomer Halley sparked his enthusiasm for mathematics. In 1755, he
became a professor in Torino. In 1766, he succeeded L. Euler (who moved to
St. Petersburg to serve Catherine the Great) as director of the
mathematics-physics section of the Prussian Academy of Sciences in Berlin. In
1786, he left Berlin for Paris after the death of king Frederick the Great. He
was the founder of analytical mechanics. His famous book, Mécanique Analytique,
contains not a single geometric figure.


Gradient as a Vector Operator 


The total variation $dF(x, y)$ in Eq. (1.57) looks like a scalar product of the
incremental length vector $d\mathbf{r} = (dx, dy)$ with a vector
$(\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y})$ of partial
derivatives in two dimensions, that is, the change of $F$ depends on the
direction in which we go. For example, $F$ could be a wave function in quantum
mechanics or describe a temperature distribution in space. When we are at the
peak value, the height will fall off at different rates in different directions,
just like a ski slope: One side might be for beginners, whereas another has only
expert runs. When we generalize this to a function $\varphi(x, y, z)$ of three
variables, we obtain Eq. (1.58),

\[
d\varphi = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy
 + \frac{\partial\varphi}{\partial z}\,dz,
\tag{1.61}
\]

for the total change in the scalar function $\varphi$ consisting of additive
contributions of each coordinate change corresponding to a change in position

\[
d\mathbf{r} = \hat{\mathbf{x}}\,dx + \hat{\mathbf{y}}\,dy + \hat{\mathbf{z}}\,dz,
\tag{1.62}
\]

the increment of length $d\mathbf{r}$. Algebraically, $d\varphi$ in Eq. (1.58) is
a scalar product of the change in position $d\mathbf{r}$ and the directional
change of $\varphi$. Now we are ready to recognize the three-dimensional partial
derivative as a vector, which leads us to the concept of gradient. A
convenient notation is

\[
\nabla \equiv \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y},
 \frac{\partial}{\partial z}\right)
 = \hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y}
 + \hat{\mathbf{z}}\frac{\partial}{\partial z},
\tag{1.63}
\]

\[
\nabla\varphi = \hat{\mathbf{x}}\frac{\partial\varphi}{\partial x}
 + \hat{\mathbf{y}}\frac{\partial\varphi}{\partial y}
 + \hat{\mathbf{z}}\frac{\partial\varphi}{\partial z},
\tag{1.64}
\]

so that $\nabla$ (del) is a vector that differentiates (scalar) functions. As
such, it is a vector operator. All the relations for $\nabla$ can be derived
from the hybrid nature of del in terms of both the partial derivatives and its
vector nature.

The gradient of a scalar is extremely important in physics and engineering
in expressing the relation between a force field and a potential field

\[
\text{force } \mathbf{F} = -\nabla(\text{potential } V),
\tag{1.65}
\]

which holds for both gravitational and electrostatic fields, among others. Note
that the minus sign in Eq. (1.65) results in water flowing downhill rather than
uphill. If a force can be described as in Eq. (1.65) by a single function
$V(\mathbf{r})$ everywhere, we call the scalar function $V$ its potential.
Because the force is the directional derivative of the potential, we can find
the potential, if it exists, by integrating the force along a suitable path.
Because the total variation $dV = \nabla V\cdot d\mathbf{r} = -\mathbf{F}\cdot d\mathbf{r}$
is the work done against the force along the path $d\mathbf{r}$, we recognize
the physical meaning of the potential (difference) as work and energy.
Moreover, in a sum of path increments the intermediate points cancel,

\[
[V(\mathbf{r} + d\mathbf{r}_1 + d\mathbf{r}_2) - V(\mathbf{r} + d\mathbf{r}_1)]
 + [V(\mathbf{r} + d\mathbf{r}_1) - V(\mathbf{r})]
 = V(\mathbf{r} + d\mathbf{r}_2 + d\mathbf{r}_1) - V(\mathbf{r}),
\]

so that the integrated work along some path from an initial point
$\mathbf{r}_i$ to a final point $\mathbf{r}$ is given by the potential
difference $V(\mathbf{r}) - V(\mathbf{r}_i)$ at the end points of the path.
Therefore, such forces are especially simple and well behaved: They are called
conservative. When there is loss of energy due to friction along the path or
some other dissipation, the work will depend on the path and such forces cannot
be conservative: No potential exists. We discuss conservative forces in more
detail in Section 1.12.
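
Path independence of the work can be seen in a small numerical experiment. The following Python sketch rests on assumptions made only for illustration: a two-dimensional sample potential V = x²y + y, two arbitrarily chosen paths from (0, 0) to (1, 1), and a midpoint rule for the line integral. It sums the work done against the force F = −∇V of Eq. (1.65) along each path.

```python
def V(x, y):
    """Hypothetical sample potential."""
    return x * x * y + y


def force(x, y):
    """F = -grad V for the sample potential above."""
    return (-2.0 * x * y, -(x * x + 1.0))


def work_against_force(path, n=2000):
    """Midpoint-rule estimate of -(integral of F . dr) along path(t), 0 <= t <= 1."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        fx, fy = force(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total -= fx * (x1 - x0) + fy * (y1 - y0)
    return total


def straight(t):
    """Straight line from (0, 0) to (1, 1)."""
    return (t, t)


def corner(t):
    """First along the x-axis to (1, 0), then parallel to the y-axis to (1, 1)."""
    return (2.0 * t, 0.0) if t < 0.5 else (1.0, 2.0 * t - 1.0)


print(work_against_force(straight))
print(work_against_force(corner))
print(V(1.0, 1.0) - V(0.0, 0.0))   # potential difference at the end points
```

Both path sums agree with the potential difference V(1, 1) − V(0, 0) = 2 to within the accuracy of the midpoint rule.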


EXAMPLE 1.5.5


The Gradient of a Function of r Because we often deal with central
forces in physics and engineering, we start with the gradient of the radial
distance $r = \sqrt{x^2 + y^2 + z^2}$. From $r$ as a function of $x$, $y$, $z$,

\[
\frac{\partial r}{\partial x} = \frac{\partial (x^2 + y^2 + z^2)^{1/2}}{\partial x}
 = \frac{x}{(x^2 + y^2 + z^2)^{1/2}} = \frac{x}{r},
\]

etc. Now we can calculate the more general gradient of a spherically symmetric
potential $f(r)$ of a central force law so that

\[
\nabla f(r) = \hat{\mathbf{x}}\frac{\partial f(r)}{\partial x}
 + \hat{\mathbf{y}}\frac{\partial f(r)}{\partial y}
 + \hat{\mathbf{z}}\frac{\partial f(r)}{\partial z},
\tag{1.66}
\]

where $f(r)$ depends on $x$ through the dependence of $r$ on $x$. Therefore,$^7$

\[
\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr}\,\frac{\partial r}{\partial x}.
\]

Therefore,

\[
\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr}\,\frac{x}{r}.
\tag{1.67}
\]

Permuting coordinates ($x \to y$, $y \to z$, $z \to x$) to obtain the $y$ and
$z$ derivatives, we get

\[
\nabla f(r) = (\hat{\mathbf{x}}x + \hat{\mathbf{y}}y + \hat{\mathbf{z}}z)\,\frac{1}{r}\frac{df}{dr}
 = \frac{\mathbf{r}}{r}\frac{df}{dr} = \hat{\mathbf{r}}\frac{df}{dr},
\tag{1.68}
\]

where $\hat{\mathbf{r}}$ is a unit vector $(\mathbf{r}/r)$ in the positive radial
direction. The gradient of a function of $r$ is a vector in the (positive or
negative) radial direction. ■
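
Equation (1.68) lends itself to a quick numerical check. The sketch below (Python; the choice f(r) = 1/r, the sample point, and the central-difference step are assumptions made for this illustration) compares a finite-difference gradient with the radial expression, which for this f is −r/r³.

```python
import math


def grad(f, p, h=1e-5):
    """Central-difference gradient of a scalar function f at the point p = (x, y, z)."""
    g = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        g.append((f(*plus) - f(*minus)) / (2.0 * h))
    return tuple(g)


def f(x, y, z):
    """Sample central potential f(r) = 1/r."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)


p = (1.0, 2.0, 2.0)                       # a point with r = 3
r = math.sqrt(sum(c * c for c in p))
expected = tuple(-c / r**3 for c in p)    # r_hat df/dr with df/dr = -1/r^2
numeric = grad(f, p)

print(numeric)
print(expected)
```

The two vectors agree component by component and point toward the origin, as expected for a function that decreases with r.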


A Geometrical Interpretation 


Example 1.5.2 illustrates the geometric meaning of the gradient of a plane: It
is its normal vector. This is a special case of the general geometric meaning of
the gradient of an implicitly defined surface $\varphi(\mathbf{r}) = \text{const}$.
Consider $P$ and $Q$ to be two points on a surface $\varphi(x, y, z) = C$, a
constant. If $\varphi$ is a potential, the surface is an equipotential surface.
These points are chosen so that $Q$ is a distance $d\mathbf{r}$ from $P$. Then,
moving from $P$ to $Q$, the change in $\varphi(x, y, z)$, given by Eq. (1.58)
that is now written in vector notation, must be

\[
d\varphi = (\nabla\varphi)\cdot d\mathbf{r} = 0
\tag{1.69}
\]

since we stay on the surface $\varphi(x, y, z) = C$. This shows that
$\nabla\varphi$ is perpendicular to $d\mathbf{r}$. Since $d\mathbf{r}$ may have
any direction from $P$ as long as it stays in the surface
$\varphi = \text{const.}$, the point $Q$ being restricted to the surface but
having arbitrary direction, $\nabla\varphi$ is seen as normal to the surface
$\varphi = \text{const.}$ (Fig. 1.23).

If we now permit $d\mathbf{r}$ to take us from one surface $\varphi = C_1$ to an
adjacent surface $\varphi = C_2$ (Fig. 1.24),

\[
d\varphi = C_2 - C_1 = \Delta C = (\nabla\varphi)\cdot d\mathbf{r}.
\tag{1.70}
\]

For a given $d\varphi$, $|d\mathbf{r}|$ is a minimum when it is chosen parallel
to $\nabla\varphi$ ($\cos\theta = 1$); for a given $|d\mathbf{r}|$, the change in
the scalar function $\varphi$ is maximized by choosing $d\mathbf{r}$ parallel to
$\nabla\varphi$. This identifies $\nabla\varphi$ as a vector having the
direction of the maximum space rate of change of $\varphi$, an identification
that will be useful in Chapter 2 when we consider non-Cartesian coordinate
systems.


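7 This is a special case of the chain rule generalized to partial derivatives:

\[
\frac{\partial f(r, \theta, \varphi)}{\partial x} =
\frac{\partial f}{\partial r}\frac{\partial r}{\partial x}
 + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial x}
 + \frac{\partial f}{\partial\varphi}\frac{\partial\varphi}{\partial x},
\]

where $\partial f/\partial\theta = \partial f/\partial\varphi = 0$,
$\partial f/\partial r \to df/dr$.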
Figure 1.23 The length increment $d\mathbf{r}$ is required to stay on the
surface $\varphi = C$.

Figure 1.24 Gradient.


SUMMARY

We have constructed the gradient operator as a vector of derivatives. The total 
variation of a function is the dot product of its gradient with the coordinate 
displacement vector. A conservative force is the (negative) gradient of a scalar 
called its potential. 

EXERCISES 

1.5.1 The dependence of the free fall acceleration $g$ on geographical latitude
$\phi$ at sea level is given by $g = g_0(1 + 0.0053\sin^2\phi)$. What is the
southward displacement near $\phi = 30°$ that changes $g$ by 1 part in $10^8$?

1.5.2 Given a vector $\mathbf{r}_{12} = \hat{\mathbf{x}}(x_1 - x_2) +
\hat{\mathbf{y}}(y_1 - y_2) + \hat{\mathbf{z}}(z_1 - z_2)$, show that
$\nabla_1 r_{12}$ (gradient with respect to $x_1$, $y_1$, and $z_1$ of the
magnitude $r_{12}$) is a unit vector in the direction of $\mathbf{r}_{12}$. Note
that a central force and a potential may depend on $r_{12}$.

1.5.3 If a vector function $\mathbf{F}$ depends on both space coordinates
$(x, y, z)$ and time $t$, show that

\[
d\mathbf{F} = (d\mathbf{r}\cdot\nabla)\mathbf{F} + \frac{\partial\mathbf{F}}{\partial t}\,dt.
\]

1.5.4 Show that $\nabla(uv) = v\nabla u + u\nabla v$, where $u$ and $v$ are
differentiable scalar functions of $x$, $y$, and $z$ (product rule).

1.5.5 (a) Show that a necessary and sufficient condition that $u(x, y, z)$ and
$v(x, y, z)$ are related by some function $f(u, v) = 0$ is that
$(\nabla u)\times(\nabla v) = 0$. Describe this geometrically. If graphical
software is available, plot a typical case.

(b) If $u = u(x, y)$ and $v = v(x, y)$, show that the condition
$(\nabla u)\times(\nabla v) = 0$ leads to the two-dimensional Jacobian

\[
J\!\left(\frac{u, v}{x, y}\right) =
\begin{vmatrix}
\dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm]
\dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y}
\end{vmatrix}
= \frac{\partial u}{\partial x}\frac{\partial v}{\partial y}
 - \frac{\partial u}{\partial y}\frac{\partial v}{\partial x} = 0.
\]

The functions $u$ and $v$ are assumed differentiable.


1.6 Divergence, ∇·


In Section 1.5, $\nabla$ was defined as a vector operator. Now, paying careful
attention to both its vector and its differential properties, we let it operate
on a vector. First, as a vector we dot it into a second vector to obtain

\[
\nabla\cdot\mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y}
 + \frac{\partial V_z}{\partial z},
\tag{1.71}
\]

known as the divergence of $\mathbf{V}$, which we expect to be a scalar.


EXAMPLE 1.6.1


Divergence of a Central Force Field From Eq. (1.71) we obtain for the
coordinate vector with radial outward flow

\[
\nabla\cdot\mathbf{r} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y}
 + \frac{\partial z}{\partial z} = 3.
\tag{1.72}
\]

Because the gravitational (or electric) force of a mass (or charge) at the
origin is proportional to $\mathbf{r}$ with a radial $1/r^3$ dependence, we also
consider the more general and important case of the divergence of a central
force field

\[
\begin{aligned}
\nabla\cdot\mathbf{r}f(r) &= \frac{\partial}{\partial x}[xf(r)] + \frac{\partial}{\partial y}[yf(r)]
 + \frac{\partial}{\partial z}[zf(r)] \\
 &= f(r)\,\nabla\cdot\mathbf{r} + x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y}
 + z\frac{\partial f}{\partial z}
 = 3f(r) + \frac{df}{dr}\,\mathbf{r}\cdot\nabla r \\
 &= 3f(r) + \frac{x^2}{r}\frac{df}{dr} + \frac{y^2}{r}\frac{df}{dr} + \frac{z^2}{r}\frac{df}{dr}
 = 3f(r) + r\frac{df}{dr},
\end{aligned}
\tag{1.73}
\]

using the product and chain rules of differentiation in conjunction with
Example 1.5.5 and Eq. (1.71). In particular, if $f(r) = r^{n-1}$,

\[
\nabla\cdot\mathbf{r}r^{n-1} = \nabla\cdot(\hat{\mathbf{r}}r^n)
 = 3r^{n-1} + (n - 1)\,r^{n-1} = (n + 2)\,r^{n-1}.
\tag{1.74}
\]

This divergence vanishes for $n = -2$, except at $r = 0$ (where
$\hat{\mathbf{r}}/r^2$ is singular). This is relevant for the Coulomb potential

\[
V(r) = A_0 = \frac{q}{4\pi\varepsilon_0 r}
\]

with the electric field

\[
\mathbf{E} = -\nabla V = \frac{q\,\hat{\mathbf{r}}}{4\pi\varepsilon_0 r^2}.
\]

Using Eq. (1.74) we obtain the divergence $\nabla\cdot\mathbf{E} = 0$ (except at
$r = 0$, where the derivatives are undefined). ■
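
Equation (1.74) can be checked numerically away from the origin. In the Python sketch below the sample point, the central-difference step, and the helper names are assumptions introduced for illustration; the divergence is formed directly from Eq. (1.71) with finite differences.

```python
import math


def field(n):
    """Return the vector field V(x, y, z) = r_vec * r**(n - 1)."""
    def V(x, y, z):
        r = math.sqrt(x * x + y * y + z * z)
        s = r ** (n - 1)
        return (x * s, y * s, z * s)
    return V


def divergence(V, p, h=1e-5):
    """Central-difference estimate of dVx/dx + dVy/dy + dVz/dz at the point p."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (V(*plus)[i] - V(*minus)[i]) / (2.0 * h)
    return total


p = (1.0, 2.0, 2.0)   # a point with r = 3, away from the singularity at r = 0
for n in (1, 2, -2):
    print(n, divergence(field(n), p), (n + 2) * 3.0 ** (n - 1))
```

For n = 1 the field is the coordinate vector itself and the divergence is 3; for n = −2, the Coulomb case, the numerical divergence vanishes to within the discretization error.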


A Physical Interpretation 


To develop an understanding of the physical significance of the divergence,
consider $\nabla\cdot(\rho\mathbf{v})$, with $\mathbf{v}(x, y, z)$, the velocity
of a compressible fluid, and $\rho(x, y, z)$, its density at point $(x, y, z)$.
If we consider a small volume $dx\,dy\,dz$ (Fig. 1.25), the fluid flowing into
this volume per unit time (positive $x$-direction) through the face $EFGH$ is
$(\text{rate of flow in})_{EFGH} = \rho v_x|_{x=0}\,dy\,dz$. The components of the
flow $\rho v_y$ and $\rho v_z$ tangential to this face contribute nothing to the
flow through this face. The rate of flow out (still positive $x$-direction)
through face $ABCD$ is $\rho v_x|_{x=dx}\,dy\,dz$. To compare these flows and to
find the net flow out, we add the change of $\rho v_x$ in the $x$-direction for
an increment $dx$ that is given by its partial derivative (i.e., expand this
last result in a Maclaurin series).$^8$ This yields

\[
(\text{rate of flow out})_{ABCD} = \rho v_x|_{x=dx}\,dy\,dz
 = \left[\rho v_x + \frac{\partial}{\partial x}(\rho v_x)\,dx\right]_{x=0} dy\,dz.
\]

Figure 1.25 Differential rectangular parallelepiped (in the first or positive
octant).


Here, the derivative term is a first correction term allowing for the
possibility of nonuniform density or velocity or both.$^9$ The zero-order term
$\rho v_x|_{x=0}$ (corresponding to uniform flow) cancels out:

\[
\text{Net rate of flow out}|_x = \frac{\partial}{\partial x}(\rho v_x)\,dx\,dy\,dz.
\]

Equivalently, we can arrive at this result by

\[
\lim_{\Delta x\to 0} \frac{\rho v_x(\Delta x, 0, 0) - \rho v_x(0, 0, 0)}{\Delta x}
 \equiv \left.\frac{\partial[\rho v_x(x, y, z)]}{\partial x}\right|_{(0,0,0)}.
\]


Now the $x$-axis is not entitled to any preferred treatment. The preceding
result for the two faces perpendicular to the $x$-axis must hold for the two
faces perpendicular to the $y$-axis, with $x$ replaced by $y$ and the
corresponding changes for $y$ and $z$: $y \to z$, $z \to x$. This is a cyclic
permutation of the coordinates. A further cyclic permutation yields the result
for the remaining two faces of our parallelepiped. Adding the net rate of flow
out for all three pairs of surfaces of our volume element, we have

\[
\begin{aligned}
\text{Net flow out (per unit time)}
 &= \left[\frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y)
 + \frac{\partial}{\partial z}(\rho v_z)\right] dx\,dy\,dz \\
 &= \nabla\cdot(\rho\mathbf{v})\,dx\,dy\,dz.
\end{aligned}
\tag{1.75}
\]


Therefore, the net flow of our compressible fluid out of the volume element
$dx\,dy\,dz$ per unit volume per unit time is $\nabla\cdot(\rho\mathbf{v})$.
Hence the name divergence. A direct application is in the continuity equation

\[
\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0,
\tag{1.76}
\]

which states that a net flow out of the volume results in a decreased density
inside the volume. Note that in Eq. (1.76), $\rho$ is considered to be a
possible function of time as well as of space: $\rho(x, y, z, t)$. The
divergence appears in a wide variety of physical problems, ranging from a
probability current density in quantum mechanics to neutron leakage in a
nuclear reactor.
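
The face-by-face bookkeeping that leads to Eq. (1.75) can be mimicked numerically. The following Python sketch is an illustration under stated assumptions: the current density `current` is a made-up sample field standing in for ρv, its divergence is written out by hand, and each face is represented by the value at its center. It adds the outflow through the three pairs of faces of a small cube and compares the sum with the divergence times the volume.

```python
def current(x, y, z):
    """Hypothetical current density rho*v = (x^2 y, y z, z x)."""
    return (x * x * y, y * z, z * x)


def div_current(x, y, z):
    """Analytic divergence of the sample field: 2xy + z + x."""
    return 2.0 * x * y + z + x


def net_outflow(j, corner, d):
    """Net flow out of a cube of side d with one corner at `corner`.

    Each face contributes (normal component at the face center) * d^2,
    counted positive for the far face and negative for the near face.
    """
    x0, y0, z0 = corner
    xc, yc, zc = x0 + d / 2, y0 + d / 2, z0 + d / 2
    out_x = (j(x0 + d, yc, zc)[0] - j(x0, yc, zc)[0]) * d * d
    out_y = (j(xc, y0 + d, zc)[1] - j(xc, y0, zc)[1]) * d * d
    out_z = (j(xc, yc, z0 + d)[2] - j(xc, yc, z0)[2]) * d * d
    return out_x + out_y + out_z


corner, d = (1.0, 2.0, 3.0), 1.0e-3
center = tuple(c + d / 2 for c in corner)

print(net_outflow(current, corner, d) / d**3)   # outflow per unit volume
print(div_current(*center))                     # divergence at the cube center
```

The outflow per unit volume reproduces the divergence at the center of the cube, which is the content of Eq. (1.75) for a small but finite volume element.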


8 A Maclaurin expansion for a single variable is given by Eq. (5.75) in Section
5.6. Here, we have the increment $x$ of Eq. (5.75) replaced by $dx$. We show a
partial derivative with respect to $x$ because $\rho v_x$ may also depend on $y$
and $z$.

9 Strictly speaking, $\rho v_x$ is averaged over face $EFGH$ and the expression
$\rho v_x + (\partial/\partial x)(\rho v_x)\,dx$ is similarly averaged over face
$ABCD$. Using an arbitrarily small differential volume, we find that the
averages reduce to the values employed here.
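The combination $\nabla\cdot(f\mathbf{V})$, in which $f$ is a scalar function and
$\mathbf{V}$ a vector function, may be written as

\[
\begin{aligned}
\nabla\cdot(f\mathbf{V}) &= \frac{\partial}{\partial x}(fV_x) + \frac{\partial}{\partial y}(fV_y)
 + \frac{\partial}{\partial z}(fV_z) \\
 &= \frac{\partial f}{\partial x}V_x + f\frac{\partial V_x}{\partial x}
 + \frac{\partial f}{\partial y}V_y + f\frac{\partial V_y}{\partial y}
 + \frac{\partial f}{\partial z}V_z + f\frac{\partial V_z}{\partial z} \\
 &= (\nabla f)\cdot\mathbf{V} + f\,\nabla\cdot\mathbf{V},
\end{aligned}
\tag{1.77}
\]

which is what we would expect for the derivative of a product. Notice that
$\nabla$ as a differential operator differentiates both $f$ and $\mathbf{V}$; as
a vector it is dotted into $\mathbf{V}$ (in each term).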


SUMMARY 


The divergence of a vector field is constructed as the dot product of the gradient 
with the vector field, and it locally measures its spatial outflow. In this sense, 
the continuity equation captures the essence of the divergence: the temporal 
change of the density balances the spatial outflow of the current density. 


EXERCISES 

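1.6.1 For a particle moving in a circular orbit
$\mathbf{r} = \hat{\mathbf{x}}\,r\cos\omega t + \hat{\mathbf{y}}\,r\sin\omega t$,

(a) evaluate $\mathbf{r}\times\dot{\mathbf{r}}$.

(b) Show that $\ddot{\mathbf{r}} + \omega^2\mathbf{r} = 0$.

The radius $r$ and the angular velocity $\omega$ are constant.

ANS. (a) $\hat{\mathbf{z}}\,\omega r^2$. Note: $\dot{\mathbf{r}} = d\mathbf{r}/dt$,
$\ddot{\mathbf{r}} = d^2\mathbf{r}/dt^2$.

1.6.2 Show, by differentiating components, that

(a) $\dfrac{d}{dt}(\mathbf{A}\cdot\mathbf{B}) = \dfrac{d\mathbf{A}}{dt}\cdot\mathbf{B}
 + \mathbf{A}\cdot\dfrac{d\mathbf{B}}{dt}$,

(b) $\dfrac{d}{dt}(\mathbf{A}\times\mathbf{B}) = \dfrac{d\mathbf{A}}{dt}\times\mathbf{B}
 + \mathbf{A}\times\dfrac{d\mathbf{B}}{dt}$,

in the same way as the derivative of the product of two scalar
functions.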


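1.7 Curl, ∇×


Another possible application of the vector $\nabla$ is to cross it into a vector
field, which gives what is called its curl; we discuss it in this section along
with its physical interpretation and applications. We obtain

\[
\begin{aligned}
\nabla\times\mathbf{V} &=
 \hat{\mathbf{x}}\left(\frac{\partial}{\partial y}V_z - \frac{\partial}{\partial z}V_y\right)
 + \hat{\mathbf{y}}\left(\frac{\partial}{\partial z}V_x - \frac{\partial}{\partial x}V_z\right)
 + \hat{\mathbf{z}}\left(\frac{\partial}{\partial x}V_y - \frac{\partial}{\partial y}V_x\right) \\
 &= \begin{vmatrix}
 \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\[1mm]
 \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm]
 V_x & V_y & V_z
 \end{vmatrix},
\end{aligned}
\tag{1.78}
\]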
which is called the curl of $\mathbf{V}$. In expanding this determinant we must
consider the derivative nature of $\nabla$. Specifically,
$\mathbf{V}\times\nabla$ is meaningless unless it acts on a function or a vector.
Then it is certainly not equal, in general, to $-\nabla\times\mathbf{V}$.$^{10}$
In the case of Eq. (1.78), the determinant must be expanded from the top
down so that we get the derivatives as shown in the middle of Eq. (1.78). If
$\nabla$ is crossed into the product of a scalar and a vector, we can show

\[
\begin{aligned}
\nabla\times(f\mathbf{V})|_x
 &= \left[\frac{\partial}{\partial y}(fV_z) - \frac{\partial}{\partial z}(fV_y)\right] \\
 &= \left(f\frac{\partial V_z}{\partial y} + \frac{\partial f}{\partial y}V_z
 - f\frac{\partial V_y}{\partial z} - \frac{\partial f}{\partial z}V_y\right) \\
 &= f\,\nabla\times\mathbf{V}|_x + (\nabla f)\times\mathbf{V}|_x.
\end{aligned}
\tag{1.79}
\]

If we permute the coordinates $x \to y$, $y \to z$, $z \to x$ to pick up the
$y$-component and then permute them a second time to pick up the $z$-component,

\[
\nabla\times(f\mathbf{V}) = f\,\nabla\times\mathbf{V} + (\nabla f)\times\mathbf{V},
\tag{1.80}
\]

which is the vector product analog of Eq. (1.77). Again, as a differential
operator, $\nabla$ differentiates both $f$ and $\mathbf{V}$. As a vector, it is
crossed into $\mathbf{V}$ (in each term).


EXAMPLE 1.7.1


Vector Potential of a Constant B Field From electrodynamics we know
that $\nabla\cdot\mathbf{B} = 0$, which has the general solution
$\mathbf{B} = \nabla\times\mathbf{A}$, where $\mathbf{A}(\mathbf{r})$ is
called the vector potential (of the magnetic induction) because
$\nabla\cdot(\nabla\times\mathbf{A}) = (\nabla\times\nabla)\cdot\mathbf{A} = 0$
as a triple scalar product with two identical vectors. This last
identity will not change if we add the gradient of some scalar function to the
vector potential, which is therefore not unique.

In our case, we want to show that a vector potential is
$\mathbf{A} = \frac{1}{2}(\mathbf{B}\times\mathbf{r})$.
Using the BAC-CAB rule in conjunction with Eq. (1.72), we find that

\[
2\nabla\times\mathbf{A} = \nabla\times(\mathbf{B}\times\mathbf{r})
 = (\nabla\cdot\mathbf{r})\mathbf{B} - (\mathbf{B}\cdot\nabla)\mathbf{r}
 = 3\mathbf{B} - \mathbf{B} = 2\mathbf{B},
\]

where we indicate by the ordering of the scalar product of the second term
that the gradient still acts on the coordinate vector. ■


EXAMPLE 1.7.2

Curl of a Central Force As in Example 1.6.1, let us start with the curl of 
the coordinate vector 


V x r = 


x y 

_ 3 _ _ 3 _ 

dx 3 y 

x y 


z 



2 


(1.81) 


10 Note that for the quantum mechanical angular momentum operator, $\mathbf{L} = -i(\mathbf{r} \times \nabla)$, we find
that $\mathbf{L} \times \mathbf{L} = i\mathbf{L}$. See Sections 4.3 and 4.4 for more details.

Algebraically, this comes about because each Cartesian coordinate is independent
of the other two.

Now we are ready to calculate the curl of a central force $\nabla \times \mathbf{r} f(r)$, where
we expect zero for the same reason. By Eq. (1.80),

$$\nabla \times \mathbf{r} f(r) = f(r)\,\nabla \times \mathbf{r} + [\nabla f(r)] \times \mathbf{r}. \qquad (1.82)$$

Second, using $\nabla f(r) = \hat{\mathbf{r}}\,(df/dr)$ (Example 1.5.5), we obtain

$$\nabla \times \mathbf{r} f(r) = \frac{df}{dr}\,\hat{\mathbf{r}} \times \mathbf{r} = 0. \qquad (1.83)$$

This vector product vanishes because $\mathbf{r} = \hat{\mathbf{r}}\, r$ and $\hat{\mathbf{r}} \times \hat{\mathbf{r}} = 0$.

This central force case is important in potential theory of classical mechanics
and engineering (see Section 1.12). ■
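The results of Examples 1.7.1 and 1.7.2 can be spot-checked numerically. The sketch below estimates the curl by central differences and evaluates it for $\mathbf{A} = \tfrac{1}{2}(\mathbf{B} \times \mathbf{r})$ and for a central field $\mathbf{r} f(r)$ with $f = r^{-3}$. The numerical values of `B`, the step `H`, the choice of $f$, and the sample point are assumptions made here for illustration only.

```python
import math

H = 1e-5  # finite-difference step (illustrative choice)

def partial(g, point, axis):
    """Central-difference estimate of the partial derivative of g along axis."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (g(*up) - g(*down)) / (2 * H)

def curl(field, point):
    """Cartesian components of curl(field) at point, following Eq. (1.78)."""
    d = lambda comp, axis: partial(lambda *p: field(*p)[comp], point, axis)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

B = (0.2, -1.5, 0.7)  # hypothetical constant magnetic induction

def A(x, y, z):
    """Vector potential A = (B x r) / 2."""
    return tuple(0.5 * c for c in cross(B, (x, y, z)))

def central(x, y, z):
    """Central field r f(r) with f(r) = r**-3 (an inverse-square force)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r**3, y / r**3, z / r**3)

point = (0.4, 0.9, -0.6)
curl_A = curl(A, point)              # expected to reproduce B
curl_central = curl(central, point)  # expected to vanish
print(curl_A, curl_central)
```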

To develop a better understanding of the physical significance of the curl,
we consider the circulation of fluid around a differential loop in the $xy$-plane
(Fig. 1.26).

Although the circulation is technically given by a vector line integral
$\int \mathbf{V}\cdot d\boldsymbol{\lambda}$, we can set up the equivalent scalar integrals here. Let us take the
circulation to be

$$
\begin{aligned}
\text{Circulation}_{1234} ={}& \int_1 V_x(x, y)\,d\lambda_x + \int_2 V_y(x, y)\,d\lambda_y \\
&+ \int_3 V_x(x, y)\,d\lambda_x + \int_4 V_y(x, y)\,d\lambda_y. \qquad (1.84)
\end{aligned}
$$

The numbers 1-4 refer to the numbered line segments in Fig. 1.26. In the first
integral $d\lambda_x = +dx$ but in the third integral $d\lambda_x = -dx$ because the third line
segment is traversed in the negative $x$-direction. Similarly, $d\lambda_y = +dy$ for the
second integral and $-dy$ for the fourth. Next, the integrands are referred to the
point $(x_0, y_0)$ with a Taylor expansion,$^{11}$ taking into account the displacement
of line segment 3 from 1 and 2 from 4. For our differential line segments, this
leads to

$$
\begin{aligned}
\text{Circulation}_{1234} ={}& V_x(x_0, y_0)\,dx + \left[V_y(x_0, y_0) + \frac{\partial V_y}{\partial x}dx\right]dy \\
&+ \left[V_x(x_0, y_0) + \frac{\partial V_x}{\partial y}dy\right](-dx) + V_y(x_0, y_0)(-dy) \\
={}& \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right)dx\,dy. \qquad (1.85)
\end{aligned}
$$

Figure 1.26 Circulation Around a Differential Loop


Dividing by $dx\,dy$, we have

$$\text{Circulation per unit area} = \nabla \times \mathbf{V}\big|_z. \qquad (1.86)$$

This is an infinitesimal case of Stokes’s theorem in Section 1.11. The circulation$^{12}$
about our differential area in the $xy$-plane is given by the $z$-component of
$\nabla \times \mathbf{V}$. In principle, the curl $\nabla \times \mathbf{V}$ at $(x_0, y_0)$ could be determined by inserting a
(differential) paddle wheel into the moving fluid at point $(x_0, y_0)$. The rotation
of the little paddle wheel would be a measure of the curl and its axis along the
direction of $\nabla \times \mathbf{V}$, which is perpendicular to the plane of circulation.
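The statement of Eq. (1.86) can be illustrated with a small but finite loop. The sketch below sums $\mathbf{V}\cdot d\boldsymbol{\lambda}$ around a square with lower-left corner $(x_0, y_0)$, traversing the four segments in the order and directions described above, and compares the circulation per unit area with the $z$-component of the curl. The two-dimensional field `V`, the loop size, and the corner point are hypothetical choices for illustration.

```python
def V(x, y):
    """Hypothetical planar velocity field (V_x, V_y)."""
    return (-y + x * y, x * x)

def curl_z(x, y):
    """Analytic z-component of the curl: dV_y/dx - dV_x/dy = 2x - (-1 + x)."""
    return x + 1.0

def circulation(x0, y0, side, n=200):
    """Midpoint-rule line integral of V around a square loop, counterclockwise."""
    h = side / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += V(x0 + t, y0)[0] * h          # segment 1: +x direction at y = y0
        total += V(x0 + side, y0 + t)[1] * h   # segment 2: +y direction at x = x0 + side
        total -= V(x0 + t, y0 + side)[0] * h   # segment 3: -x direction at y = y0 + side
        total -= V(x0, y0 + t)[1] * h          # segment 4: -y direction at x = x0
    return total

side = 1e-3
per_area = circulation(0.5, 0.2, side) / side**2
print(per_area, curl_z(0.5, 0.2))  # the values approach each other as side -> 0
```

For this polynomial field the difference between the two numbers is of order `side`, which is the finite-size remnant of the higher order Taylor terms that drop out in the differential limit.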

In light of this connection of the curl with the concept of circulation, we now
understand intuitively the vanishing curl of a central force in Example 1.7.2
because $\mathbf{r}$ flows radially outward from the origin with no rotation, and any
scalar $f(r)$ will not affect this situation. When

$$\nabla \times \mathbf{V} = 0, \qquad (1.87)$$

$\mathbf{V}$ is labeled irrotational. The most important physical examples of irrotational
vectors are the gravitational and electrostatic forces. In each case,

$$\mathbf{V} = C\,\frac{\hat{\mathbf{r}}}{r^2}, \qquad (1.88)$$

where $C$ is a constant and $\hat{\mathbf{r}}$ is the unit vector in the outward radial direction.
For the gravitational case, we have $C = -Gm_1m_2$, given by Newton’s law of
universal gravitation. If $C = q_1q_2/(4\pi\varepsilon_0)$, we have Coulomb’s law of electrostatics
(SI units). The force $\mathbf{V}$ given in Eq. (1.88) may be shown to be irrotational
by direct expansion into Cartesian components as we did in Example 1.7.2
[Eq. (1.83)].

In Section 1.15 of Arfken and Weber’s Mathematical Methods for Physicists
(5th ed.), it is shown that a vector field may be resolved into an irrotational
part and a solenoidal part (subject to conditions at infinity).


11 $V_y(x_0 + dx, y_0) = V_y(x_0, y_0) + \left(\dfrac{\partial V_y}{\partial x}\right)_{x_0 y_0} dx + \cdots$. The higher order terms will drop out in the
limit as $dx \to 0$.

12 In fluid dynamics, $\nabla \times \mathbf{V}$ is called the vorticity.

For waves in an elastic medium, if the displacement $\mathbf{u}$ is irrotational,
$\nabla \times \mathbf{u} = 0$, plane waves (or spherical waves at large distances) become longitudinal.
If $\mathbf{u}$ is solenoidal, $\nabla \cdot \mathbf{u} = 0$, then the waves become transverse. A
seismic disturbance will produce a displacement that may be resolved into a
solenoidal part and an irrotational part. The irrotational part yields the longitudinal
P (primary) earthquake waves. The solenoidal part gives rise to the
slower transverse S (secondary) waves.

Using the gradient, divergence, curl, and the BAC-CAB rule, we may construct
or verify a large number of useful vector identities. For verification,
complete expansion into Cartesian components is always a possibility. Sometimes
if we use insight instead of routine shuffling of Cartesian components,
the verification process can be shortened drastically.

Remember that $\nabla$ is a vector operator, a hybrid object satisfying two sets
of rules: vector rules and partial differentiation rules, including differentiation
of a product.

EXAMPLE 1.7.3

Gradient of a Dot Product Verify that

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} + (\mathbf{A}\cdot\nabla)\mathbf{B} + \mathbf{B}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{B}). \qquad (1.89)$$

This particular example hinges on the recognition that $\nabla(\mathbf{A}\cdot\mathbf{B})$ is the type
of term that appears in the BAC-CAB expansion of a triple vector product
[Eq. (1.52)]. For instance,

$$\mathbf{A}\times(\nabla\times\mathbf{B}) = \nabla(\mathbf{A}\cdot\mathbf{B}) - (\mathbf{A}\cdot\nabla)\mathbf{B},$$

with the $\nabla$ differentiating only $\mathbf{B}$, not $\mathbf{A}$. From the commutativity of factors in
a scalar product we may interchange $\mathbf{A}$ and $\mathbf{B}$ and write

$$\mathbf{B}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{A}\cdot\mathbf{B}) - (\mathbf{B}\cdot\nabla)\mathbf{A},$$

now with $\nabla$ differentiating only $\mathbf{A}$, not $\mathbf{B}$. Adding these two equations, we obtain
$\nabla$ differentiating the product $\mathbf{A}\cdot\mathbf{B}$ and the identity [Eq. (1.89)]. This
identity is used frequently in electromagnetic theory. Exercise 1.7.9 is an
illustration. ■

SUMMARY

The curl is constructed as the cross product of the gradient and a vector field,
and it measures the local rotational flow or circulation of the vector field.
When the curl of a force field is zero, then the force is labeled conservative
and derives from the gradient of a scalar, its potential. In Chapter 6, we shall see
that an analytic function of a complex variable describes a two-dimensional
irrotational fluid flow.

EXERCISES 

1.7.1 Show that $\mathbf{u} \times \mathbf{v}$ is solenoidal if $\mathbf{u}$ and $\mathbf{v}$ are each irrotational. Start by
formulating the problem in terms of mathematical equations.

1.7.2 If $\mathbf{A}$ is irrotational, show that $\mathbf{A} \times \mathbf{r}$ is solenoidal.

1.7.3 A rigid body is rotating with constant angular velocity $\boldsymbol{\omega}$. Show that the
linear velocity $\mathbf{v}$ is solenoidal.

1.7.4 If a vector function $\mathbf{f}(x, y, z)$ is not irrotational but the product of $\mathbf{f}$ and
a scalar function $g(x, y, z)$ is irrotational, show that

$$\mathbf{f}\cdot\nabla\times\mathbf{f} = 0.$$


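1.7.5 Verify the vector identity

$$\nabla\times(\mathbf{A}\times\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B} - \mathbf{B}(\nabla\cdot\mathbf{A}) + \mathbf{A}(\nabla\cdot\mathbf{B}).$$

Describe in words what causes the last two terms to appear in the
identity beyond the BAC-CAB rule. If symbolic software is available,
test the Cartesian components for a typical case, such as $\mathbf{A} = \mathbf{L}$, $\mathbf{B} =
\mathbf{r}/r^3$.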

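1.7.6 As an alternative to the vector identity of Example 1.7.5, show that

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\times\nabla)\times\mathbf{B} + (\mathbf{B}\times\nabla)\times\mathbf{A} + \mathbf{A}(\nabla\cdot\mathbf{B}) + \mathbf{B}(\nabla\cdot\mathbf{A}).$$

1.7.7 Verify the identity

$$\mathbf{A}\times(\nabla\times\mathbf{A}) = \tfrac{1}{2}\nabla(A^2) - (\mathbf{A}\cdot\nabla)\mathbf{A}.$$

Test this identity for a typical vector field, such as $\mathbf{A} \sim \mathbf{r}$ or $\mathbf{r}/r^3$.

1.7.8 If $\mathbf{A}$ and $\mathbf{B}$ are constant vectors, show that

$$\nabla(\mathbf{A}\cdot\mathbf{B}\times\mathbf{r}) = \mathbf{A}\times\mathbf{B}.$$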

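1.7.9 A distribution of electric currents creates a constant magnetic moment
$\mathbf{m}$. The force on $\mathbf{m}$ in an external magnetic induction $\mathbf{B}$ is given by

$$\mathbf{F} = \nabla\times(\mathbf{B}\times\mathbf{m}).$$

Show that

$$\mathbf{F} = \nabla(\mathbf{m}\cdot\mathbf{B}).$$

Note. Assuming no time dependence of the fields, Maxwell’s equations
yield $\nabla\times\mathbf{B} = 0$. Also, $\nabla\cdot\mathbf{B} = 0$.

1.7.10 An electric dipole of moment $\mathbf{p}$ is located at the origin. The dipole
creates an electric potential at $\mathbf{r}$ given by

$$\psi(\mathbf{r}) = \frac{\mathbf{p}\cdot\mathbf{r}}{4\pi\varepsilon_0 r^3}.$$

Find the electric field $\mathbf{E} = -\nabla\psi$ at $\mathbf{r}$.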

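1.7.11 The vector potential $\mathbf{A}$ of a magnetic dipole, dipole moment $\mathbf{m}$, is given
by $\mathbf{A}(\mathbf{r}) = (\mu_0/4\pi)(\mathbf{m}\times\mathbf{r}/r^3)$. Show that the magnetic induction $\mathbf{B} =
\nabla\times\mathbf{A}$ is given by

$$\mathbf{B} = \frac{\mu_0}{4\pi}\,\frac{3\hat{\mathbf{r}}(\hat{\mathbf{r}}\cdot\mathbf{m}) - \mathbf{m}}{r^3}.$$

1.7.12 Classically, orbital angular momentum is given by $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, where $\mathbf{p}$
is the linear momentum. To go from classical mechanics to quantum
mechanics, replace $\mathbf{p}$ by the operator $-i\nabla$ (Section 14.6). Show that
the quantum mechanical angular momentum operator has Cartesian
components

$$
\begin{aligned}
L_x &= -i\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \\
L_y &= -i\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \\
L_z &= -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)
\end{aligned}
$$

(in units of $\hbar$).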

1.7.13 Using the angular momentum operators previously given, show that
they satisfy commutation relations of the form

$$[L_x, L_y] \equiv L_xL_y - L_yL_x = iL_z$$

and, hence,

$$\mathbf{L}\times\mathbf{L} = i\mathbf{L}.$$

These commutation relations will be taken later as the defining relations
of an angular momentum operator; see Exercise 3.2.15 and the
following one and Chapter 4.

1.7.14 With the commutator bracket notation $[L_x, L_y] = L_xL_y - L_yL_x$, the
angular momentum vector $\mathbf{L}$ satisfies $[L_x, L_y] = iL_z$, etc., or $\mathbf{L}\times\mathbf{L} = i\mathbf{L}$.
If two other vectors $\mathbf{a}$ and $\mathbf{b}$ commute with each other and with $\mathbf{L}$, that
is, $[\mathbf{a}, \mathbf{b}] = [\mathbf{a}, \mathbf{L}] = [\mathbf{b}, \mathbf{L}] = 0$, show that

$$[\mathbf{a}\cdot\mathbf{L}, \mathbf{b}\cdot\mathbf{L}] = i(\mathbf{a}\times\mathbf{b})\cdot\mathbf{L}.$$

This vector version of the angular momentum commutation relations
is an alternative to that given in Exercise 1.7.13.

1.7.15 Prove $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$. Explain in words why
the identity is valid.

Hint. Treat as a triple scalar product.


1.8 Successive Applications of $\nabla$


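We have now defined gradient, divergence, and curl to obtain vector, scalar, and
vector quantities, respectively. Letting $\nabla$ operate on each of these quantities,
we obtain

(a) $\nabla\cdot\nabla\varphi$   (b) $\nabla\times\nabla\varphi$   (c) $\nabla\,\nabla\cdot\mathbf{V}$

(d) $\nabla\cdot\nabla\times\mathbf{V}$   (e) $\nabla\times(\nabla\times\mathbf{V})$.

All five expressions involve second derivatives and all five appear in the second-order
differential equations of mathematical physics, particularly in electromagnetic
theory.

The first expression, $\nabla\cdot\nabla\varphi$, the divergence of the gradient, is called the
Laplacian of $\varphi$. We have

$$
\begin{aligned}
\nabla\cdot\nabla\varphi &= \left(\hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}\right)\cdot\left(\hat{\mathbf{x}}\frac{\partial\varphi}{\partial x} + \hat{\mathbf{y}}\frac{\partial\varphi}{\partial y} + \hat{\mathbf{z}}\frac{\partial\varphi}{\partial z}\right) \\
&= \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}. \qquad (1.90)
\end{aligned}
$$

When $\varphi$ is the electrostatic potential, in a charge-free region we have

$$\nabla\cdot\nabla\varphi = 0, \qquad (1.91)$$

which is Laplace’s equation of electrostatics. Often, the combination $\nabla\cdot\nabla$ is
written $\nabla^2$, or $\Delta$ in the European literature.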


Biographical Data 

Laplace, Pierre Simon. Laplace, a French mathematician, physicist, and
astronomer, was born in Beaumont-en-Auge in 1749 and died in Paris in 1827.
He developed perturbation theory for the solar system, published a monumental
treatise Celestial Mechanics, and applied mathematics to artillery.
He made contributions of fundamental importance to hydrodynamics, differential
equations and probability, the propagation of sound, and surface
tension in liquids. To Napoleon’s remark that “God” was missing from his treatise, he
replied “I had no need for that hypothesis.” He generally disliked giving
credit to others.


EXAMPLE 1.8.1

Laplacian of a Radial Function Calculate $\nabla\cdot\nabla g(r)$. Referring to Examples
1.5.5 and 1.6.1,

$$\nabla\cdot\nabla g(r) = \nabla\cdot\hat{\mathbf{r}}\frac{dg}{dr} = \frac{2}{r}\frac{dg}{dr} + \frac{d^2g}{dr^2},$$

replacing $f(r)$ in Example 1.6.1 by $(1/r)\cdot dg/dr$. If $g(r) = r^n$, this reduces to

$$\nabla\cdot\nabla r^n = n(n+1)\,r^{n-2}.$$

This vanishes for $n = 0$ [$g(r)$ = constant] and for $n = -1$; that is, $g(r) = 1/r$
is a solution of Laplace’s equation, $\nabla^2 g(r) = 0$. This is for $r \neq 0$. At the origin
there is a singularity. ■
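The result $\nabla\cdot\nabla r^n = n(n+1)r^{n-2}$ of Example 1.8.1 can be checked away from the origin with a second-difference estimate of the Cartesian Laplacian, Eq. (1.90). In this sketch the step `h`, the sample point, and the tested exponents are arbitrary illustrative choices.

```python
import math

def laplacian(g, x, y, z, h=1e-3):
    """Second-difference estimate of the Cartesian Laplacian of g at (x, y, z)."""
    center = g(x, y, z)
    return ((g(x + h, y, z) + g(x - h, y, z) - 2 * center)
            + (g(x, y + h, z) + g(x, y - h, z) - 2 * center)
            + (g(x, y, z + h) + g(x, y, z - h) - 2 * center)) / h**2

def radial_power(n):
    """Return the function g(r) = r**n written in Cartesian coordinates."""
    return lambda x, y, z: math.sqrt(x * x + y * y + z * z) ** n

def expected(n, r):
    """Closed form n(n+1) r**(n-2) from Example 1.8.1."""
    return n * (n + 1) * r ** (n - 2)

point = (0.6, -0.8, 1.2)  # any point with r != 0
r = math.sqrt(sum(c * c for c in point))
results = {n: (laplacian(radial_power(n), *point), expected(n, r))
           for n in (-1, 0, 2, 3)}
for n, (numeric, exact) in results.items():
    print(n, numeric, exact)
```

The entries for $n = 0$ and $n = -1$ should both be close to zero, in line with $1/r$ solving Laplace’s equation for $r \neq 0$.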


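Expression (b) may be written as

$$
\nabla\times\nabla\varphi = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial\varphi}{\partial x} & \dfrac{\partial\varphi}{\partial y} & \dfrac{\partial\varphi}{\partial z} \end{vmatrix}.
$$

By expanding the determinant, we obtain

$$
\begin{aligned}
\nabla\times\nabla\varphi ={}& \hat{\mathbf{x}}\left(\frac{\partial^2\varphi}{\partial y\,\partial z} - \frac{\partial^2\varphi}{\partial z\,\partial y}\right) + \hat{\mathbf{y}}\left(\frac{\partial^2\varphi}{\partial z\,\partial x} - \frac{\partial^2\varphi}{\partial x\,\partial z}\right) \\
&+ \hat{\mathbf{z}}\left(\frac{\partial^2\varphi}{\partial x\,\partial y} - \frac{\partial^2\varphi}{\partial y\,\partial x}\right) = 0, \qquad (1.92)
\end{aligned}
$$

assuming that the order of partial differentiation may be interchanged. This
is true as long as these second partial derivatives of $\varphi$ are continuous functions.
Then, from Eq. (1.92), the curl of a gradient is identically zero. All gradients,
therefore, are irrotational. Note that the zero in Eq. (1.92) comes as a
mathematical identity, independent of any physics. The zero in Eq. (1.91) is a
consequence of physics.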

Expression (d) is a triple scalar product that may be written as

$$
\nabla\cdot\nabla\times\mathbf{V} = \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ V_x & V_y & V_z \end{vmatrix}. \qquad (1.93)
$$

Again, assuming continuity so that the order of differentiation is immaterial,
we obtain

$$\nabla\cdot\nabla\times\mathbf{V} = 0. \qquad (1.94)$$

The divergence of a curl vanishes or all curls are solenoidal.

One of the most important cases of a vanishing divergence of a vector is

$$\nabla\cdot\mathbf{B} = 0, \qquad (1.95)$$

where $\mathbf{B}$ is the magnetic induction, and Eq. (1.95) appears as one of Maxwell’s
equations. When a vector is solenoidal, it may be written as the curl of another
vector known as its vector potential, $\mathbf{B} = \nabla\times\mathbf{A}$. This form solves one of
the four vector equations that make up Maxwell’s field equations of electrodynamics.
Because a vector field may be determined from its curl and divergence
(Helmholtz’s theorem), solving Maxwell’s (often called Oersted’s) equation
involving the curl of $\mathbf{B}$ determines $\mathbf{A}$ and thereby $\mathbf{B}$. Similar considerations apply
to the other pair of Maxwell’s equations involving the divergence and curl of
$\mathbf{E}$ and make plausible the fact that there are precisely four vector equations as
part of Maxwell’s equations.

The two remaining expressions satisfy a relation

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla(\nabla\cdot\mathbf{V}) - (\nabla\cdot\nabla)\mathbf{V}. \qquad (1.96)$$

This decomposition of the Laplacian $\nabla\cdot\nabla$ into a longitudinal part (the gradient)
and a transverse part (the curl term) follows from Eq. (1.52), the BAC-CAB
rule, which we rewrite so that $\mathbf{C}$ appears at the extreme right of each term. The
term $(\nabla\cdot\nabla)\mathbf{V}$ was not included in our list, but it appears in the Navier-Stokes
equation and may be defined by Eq. (1.96). In words, this is the Laplacian (a
scalar operator) acting on a vector, so it is a vector with three components in
three-dimensional space. ■

EXAMPLE 1.8.2

Electromagnetic Wave Equations One important application of this vector
relation [Eq. (1.96)] is in the derivation of the electromagnetic wave equation.
In vacuum Maxwell’s equations become

$$
\begin{aligned}
\nabla\cdot\mathbf{B} &= 0, && (1.97\text{a}) \\
\nabla\cdot\mathbf{E} &= 0, && (1.97\text{b}) \\
\nabla\times\mathbf{B} &= \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t} = \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t}, && (1.97\text{c}) \\
\nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t}, && (1.97\text{d})
\end{aligned}
$$

where $\mathbf{E}$ is the electric field, $\mathbf{B}$ the magnetic induction, $\varepsilon_0$ the electric permittivity,
and $\mu_0$ the magnetic permeability (SI units), so that $\varepsilon_0\mu_0 = 1/c^2$, where
$c$ is the velocity of light. This relation has important consequences. Because
$\varepsilon_0$, $\mu_0$ can be measured in any frame, the velocity of light is the same in any
frame.

Suppose we eliminate $\mathbf{B}$ from Eqs. (1.97c) and (1.97d). We may do this by
taking the curl of both sides of Eq. (1.97d) and the time derivative of both sides
of Eq. (1.97c). Since the space and time derivatives commute,

$$\frac{\partial}{\partial t}\nabla\times\mathbf{B} = \nabla\times\frac{\partial\mathbf{B}}{\partial t},$$

and we obtain

$$\nabla\times(\nabla\times\mathbf{E}) = -\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}.$$

Application of Eqs. (1.96) and (1.97b) yields

$$(\nabla\cdot\nabla)\mathbf{E} = \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}, \qquad (1.98)$$

the electromagnetic vector wave equation. Again, if $\mathbf{E}$ is expressed in
Cartesian coordinates, Eq. (1.98) separates into three scalar wave equations,
each involving a scalar Laplacian.
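A plane wave gives the simplest illustration of one of the scalar wave equations contained in Eq. (1.98). The sketch below takes a single Cartesian component $E_y(x, t) = \cos(kx - \omega t)$ and checks by second differences that $\partial^2 E_y/\partial x^2 - (1/c^2)\,\partial^2 E_y/\partial t^2$ is numerically zero when $\omega = ck$. The values of `C` and `K` are arbitrary units chosen for illustration (not SI values), and the function names are hypothetical.

```python
import math

C = 3.0        # wave speed in arbitrary units (illustrative, not the SI value)
K = 2.0        # wave number (arbitrary)
OMEGA = C * K  # dispersion relation required by the wave equation

def e_y(x, t):
    """One Cartesian field component of a plane wave traveling along x."""
    return math.cos(K * x - OMEGA * t)

def second_difference(g, s, h=1e-3):
    """Central second-difference estimate of g''(s)."""
    return (g(s + h) + g(s - h) - 2.0 * g(s)) / h**2

def wave_residual(x, t):
    """Laplacian term minus (1/c^2) times the second time derivative."""
    d2_dx2 = second_difference(lambda s: e_y(s, t), x)
    d2_dt2 = second_difference(lambda s: e_y(x, s), t)
    return d2_dx2 - d2_dt2 / C**2

residual = wave_residual(0.37, 1.25)
print(residual)  # small compared with K**2 = 4, the size of each individual term
```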

When external electric charge and current densities are kept as driving
terms in Maxwell’s equations, similar wave equations are valid for the electric
potential and the vector potential. To show this, we solve Eq. (1.97a) by writing
$\mathbf{B} = \nabla\times\mathbf{A}$ as a curl of the vector potential. This expression is substituted into
Faraday’s induction law in differential form [Eq. (1.97d)] to yield $\nabla\times(\mathbf{E} + \partial\mathbf{A}/\partial t) =
0$. The vanishing curl implies that $\mathbf{E} + \partial\mathbf{A}/\partial t$ is a gradient and therefore can be
written as $-\nabla\varphi$, where $\varphi(\mathbf{r}, t)$ is defined as the (nonstatic) electric potential.
These results

$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t} \qquad (1.99)$$

for the $\mathbf{B}$ and $\mathbf{E}$ fields solve the homogeneous Maxwell’s equations.

We now show that the inhomogeneous Maxwell’s equations,

$$
\begin{aligned}
&\text{Gauss’s law: } \nabla\cdot\mathbf{E} = \rho/\varepsilon_0; \\
&\text{Oersted’s law: } \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J} \qquad (1.100)
\end{aligned}
$$

in differential form lead to wave equations for the potentials $\varphi$ and $\mathbf{A}$, provided
that $\nabla\cdot\mathbf{A}$ is determined by the constraint $\dfrac{1}{c^2}\dfrac{\partial\varphi}{\partial t} + \nabla\cdot\mathbf{A} = 0$. This choice of
fixing the divergence of the vector potential is called the Lorentz gauge and
serves to uncouple the partial differential equations of both potentials. This
gauge constraint is not a restriction; it has no physical effect.

Substituting our electric field solution into Gauss’s law yields

$$\frac{\rho}{\varepsilon_0} = \nabla\cdot\mathbf{E} = -\nabla^2\varphi - \frac{\partial}{\partial t}\nabla\cdot\mathbf{A} = -\nabla^2\varphi + \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2},$$

the wave equation for the electric potential. In the last step, we used the Lorentz
gauge to replace the divergence of the vector potential by the time derivative
of the electric potential and thus decouple $\varphi$ from $\mathbf{A}$.

Finally, we substitute $\mathbf{B} = \nabla\times\mathbf{A}$ into Oersted’s law and use Eq. (1.96),
which expands $\nabla^2$ in terms of a longitudinal (the gradient term) and a transverse
component (the curl term). This yields

$$\mu_0\mathbf{J} + \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \mu_0\mathbf{J} - \frac{1}{c^2}\left(\nabla\frac{\partial\varphi}{\partial t} + \frac{\partial^2\mathbf{A}}{\partial t^2}\right),$$

where we have used the electric field solution [Eq. (1.99)] in the last step. Now
we see that the Lorentz gauge condition eliminates the gradient terms so that
the wave equation

$$\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla^2\mathbf{A} = \mu_0\mathbf{J}$$

for the vector potential remains.

Finally, looking back at Oersted’s law, taking the divergence of Eq. (1.100),
dropping $\nabla\cdot(\nabla\times\mathbf{B}) = 0$ and substituting Gauss’s law for $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$, we
find $\mu_0\nabla\cdot\mathbf{J} = -\dfrac{1}{\varepsilon_0c^2}\dfrac{\partial\rho}{\partial t}$, where $\varepsilon_0\mu_0 = 1/c^2$, that is, the continuity equation for
the current density. This step justifies the inclusion of Maxwell’s displacement
current in the generalization of Oersted’s law to nonstationary situations. ■


EXERCISES 

1.8.1 Verify Eq. (1.96)

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla(\nabla\cdot\mathbf{V}) - (\nabla\cdot\nabla)\mathbf{V}$$

by direct expansion in Cartesian coordinates. If symbolic software is
available, check the identity for typical fields, such as $\mathbf{V} = \mathbf{r}$, $\mathbf{r}/r^3$,
$\mathbf{a}\cdot\mathbf{r}\,\mathbf{b}$, $\mathbf{a}\times\mathbf{r}$.

1.8.2 Show that the identity

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla(\nabla\cdot\mathbf{V}) - (\nabla\cdot\nabla)\mathbf{V}$$

follows from the BAC-CAB rule for a triple vector product. Justify any
alteration of the order of factors in the BAC and CAB terms.

1.8.3 Prove that $\nabla\times(\varphi\nabla\varphi) = 0$.

1.8.4 Prove that $(\nabla u)\times(\nabla v)$ is solenoidal, where $u$ and $v$ are differentiable
scalar functions. Start by formulating the problem as a mathematical
equation.

1.8.5 $\varphi$ is a scalar satisfying Laplace’s equation, $\nabla^2\varphi = 0$. Show that $\nabla\varphi$ is
both solenoidal and irrotational.

1.8.6 With $\psi$ a scalar function, show that

$$(\mathbf{r}\times\nabla)\cdot(\mathbf{r}\times\nabla)\psi = r^2\nabla^2\psi - r^2\frac{\partial^2\psi}{\partial r^2} - 2r\frac{\partial\psi}{\partial r}.$$

(This can actually be shown more easily in spherical polar coordinates;
see Section 2.5.)

1.8.7 In the Pauli theory of the electron one encounters the expression

$$(\mathbf{p} - e\mathbf{A})\times(\mathbf{p} - e\mathbf{A})\psi,$$

where $\psi$ is a scalar function. $\mathbf{A}$ is the magnetic vector potential related
to the magnetic induction $\mathbf{B}$ by $\mathbf{B} = \nabla\times\mathbf{A}$. Given that $\mathbf{p} = -i\nabla$, show
that this expression reduces to $ie\mathbf{B}\psi$. Show that this leads to the orbital
$g$-factor $g_L = 1$ upon writing the magnetic moment as $\boldsymbol{\mu} = g_L\mathbf{L}$ in units
of Bohr magnetons. See also Example 1.7.1.

1.8.8 Show that any solution of the equation

$$\nabla\times(\nabla\times\mathbf{A}) - k^2\mathbf{A} = 0$$

automatically satisfies the vector Helmholtz equation

$$\nabla^2\mathbf{A} + k^2\mathbf{A} = 0$$

and the solenoidal condition

$$\nabla\cdot\mathbf{A} = 0.$$

Hint. Let $\nabla\cdot$ operate on the first equation.


1.9 Vector Integration 


The next step after differentiating vectors is to integrate them. Let us start with
line integrals and then proceed to surface and volume integrals. In each case,
the method of attack will be to reduce the vector integral to one-dimensional
integrals over a coordinate interval.

Line Integrals

Using an increment of length $d\mathbf{r} = \hat{\mathbf{x}}\,dx + \hat{\mathbf{y}}\,dy + \hat{\mathbf{z}}\,dz$, we often encounter the
line integral

$$\int_C \mathbf{V}\cdot d\mathbf{r}, \qquad (1.101)$$

in which the integral is over some contour $C$ that may be open (with starting
point and ending point separated) or closed (forming a loop) instead of an
interval of the $x$-axis. The Riemann integral is defined by subdividing the curve
into ever smaller segments whose number grows indefinitely. The form [Eq.
(1.101)] is exactly the same as that encountered when we calculate the work
done by a force that varies along the path

$$W = \int\mathbf{F}\cdot d\mathbf{r} = \int F_x(x, y, z)\,dx + \int F_y(x, y, z)\,dy + \int F_z(x, y, z)\,dz, \qquad (1.102)$$

that is, a sum of conventional integrals over intervals of one variable each.
In this expression, $\mathbf{F}$ is the force exerted on a particle. In general, such integrals
depend on the path except for conservative forces, whose treatment we
postpone to Section 1.12.

EXAMPLE 1.9.1

Path-Dependent Work The force exerted on a body is $\mathbf{F} = -\hat{\mathbf{x}}\,y + \hat{\mathbf{y}}\,x$. The
problem is to calculate the work done going from the origin to the point (1, 1),

$$W = \int_{0,0}^{1,1}\mathbf{F}\cdot d\mathbf{r} = \int_{0,0}^{1,1}(-y\,dx + x\,dy). \qquad (1.103)$$

Separating the two integrals, we obtain

$$W = -\int_0^1 y\,dx + \int_0^1 x\,dy. \qquad (1.104)$$

The first integral cannot be evaluated until we specify the values of $y$ as $x$
ranges from 0 to 1. Likewise, the second integral requires $x$ as a function of $y$.
Consider first the path shown in Fig. 1.27. Then

$$W = -\int_0^1 0\,dx + \int_0^1 1\,dy = 1 \qquad (1.105)$$

because $y = 0$ along the first segment of the path and $x = 1$ along the second.
If we select the path $[x = 0, 0 \le y \le 1]$ and $[0 \le x \le 1, y = 1]$, then
Eq. (1.103) gives $W = -1$. For this force, the work done depends on the choice
of path. ■
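The path dependence found in Example 1.9.1 is easy to reproduce numerically. The sketch below evaluates $\int\mathbf{F}\cdot d\mathbf{r}$ for $\mathbf{F} = -\hat{\mathbf{x}}\,y + \hat{\mathbf{y}}\,x$ along piecewise-straight paths with the midpoint rule. The helper name `work_along`, the number of subdivisions, and the additional diagonal path are choices made here for illustration.

```python
def force(x, y):
    """Force field of Example 1.9.1: F = (-y, x)."""
    return (-y, x)

def work_along(points, n=1000):
    """Midpoint-rule estimate of the line integral of F along a polyline."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        dx, dy = (xb - xa) / n, (yb - ya) / n
        for k in range(n):
            t = (k + 0.5) / n
            fx, fy = force(xa + t * (xb - xa), ya + t * (yb - ya))
            total += fx * dx + fy * dy
    return total

w_first = work_along([(0, 0), (1, 0), (1, 1)])   # along the x-axis, then up (Fig. 1.27)
w_second = work_along([(0, 0), (0, 1), (1, 1)])  # up the y-axis, then across
w_diagonal = work_along([(0, 0), (1, 1)])        # straight diagonal, an extra test path
print(w_first, w_second, w_diagonal)
```

The first two values reproduce $W = 1$ and $W = -1$; on the diagonal $y = x$ the integrand $-y\,dx + x\,dy$ vanishes identically, giving a third, different value for the same end points.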


EXAMPLE 1.9.2

Line Integral for Work Find the work done going around a unit circle
clockwise from 0 to $-\pi$ shown in Fig. 1.28 in the $xy$-plane doing work against
a force field given by

$$\mathbf{F} = \frac{-\hat{\mathbf{x}}\,y + \hat{\mathbf{y}}\,x}{x^2 + y^2}.$$

Figure 1.27 A Path of Integration

Figure 1.28 Circular and Square Integration Paths

Let us parameterize the circle $C$ as $x = \cos\varphi$, $y = \sin\varphi$ with the polar angle
$\varphi$ so that $dx = -\sin\varphi\,d\varphi$, $dy = \cos\varphi\,d\varphi$. Then the force can be written as
$\mathbf{F} = -\hat{\mathbf{x}}\sin\varphi + \hat{\mathbf{y}}\cos\varphi$. The work becomes

$$-\int_C\frac{x\,dy - y\,dx}{x^2 + y^2} = \int_0^{-\pi}(-\sin^2\varphi - \cos^2\varphi)\,d\varphi = \pi.$$

Here we spend energy. If we integrate anticlockwise from $\varphi = 0$ to $\pi$ we find
the value $-\pi$ because we are riding with the force. The work is path dependent,
which is consistent with the physical interpretation that $\mathbf{F}\cdot d\mathbf{r} \sim x\,dy - y\,dx \sim
L_z$ is proportional to the $z$-component of orbital angular momentum (involving
circulation, as discussed in Section 1.7).

If we integrate along the square through the points $(\pm 1, 0)$, $(0, -1)$ surrounding
the circle, we find for the clockwise lower half square path of Fig.
1.28

$$
\begin{aligned}
-\int\mathbf{F}\cdot d\mathbf{r} &= -\int_0^{-1}F_y\,dy\Big|_{x=1} - \int_1^{-1}F_x\,dx\Big|_{y=-1} - \int_{-1}^{0}F_y\,dy\Big|_{x=-1} \\
&= \int_0^1\frac{dy}{1 + y^2} + \int_{-1}^{1}\frac{dx}{x^2 + (-1)^2} + \int_{-1}^{0}\frac{dy}{(-1)^2 + y^2} \\
&= \arctan(1) + \arctan(1) - \arctan(-1) - \arctan(-1) \\
&= 4\cdot\frac{\pi}{4} = \pi,
\end{aligned}
$$

which is consistent with the circular path.

For the circular paths we used the $x = \cos\varphi$, $y = \sin\varphi$ parameterization,
whereas for the square shape we used the standard definitions $y = f(x)$ or
$x = g(y)$ of a curve, that is, $y = -1 = $ const. and $x = \pm 1 = $ const. We could
have used the implicit definition $F(x, y) \equiv x^2 + y^2 - 1 = 0$ of the circle. Then
the total variation

$$dF = \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial y}dy = 2x\,dx + 2y\,dy \equiv 0$$

so that

$$dy = -x\,dx/y \quad\text{with}\quad y = -\sqrt{1 - x^2}$$

on our half circle. The work becomes

$$-\int_C\frac{x\,dy - y\,dx}{x^2 + y^2} = \int\left(\frac{x^2}{y} + y\right)dx = \int\frac{dx}{y} = \int_{-1}^{1}\frac{dx}{\sqrt{1 - x^2}} = \arcsin 1 - \arcsin(-1) = 2\cdot\frac{\pi}{2} = \pi,$$

in agreement with our previous results.
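The three evaluations in Example 1.9.2 can be cross-checked by direct numerical integration of $-\int\mathbf{F}\cdot d\mathbf{r}$ for $\mathbf{F} = (-\hat{\mathbf{x}}\,y + \hat{\mathbf{y}}\,x)/(x^2 + y^2)$. In the sketch below each path is approximated by many short chords and the force is sampled at each chord midpoint. The helper names `arc`, `polyline`, and `work_against` and the number of subdivisions are hypothetical choices for illustration.

```python
import math

def force(x, y):
    """Force field of Example 1.9.2."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def work_against(points):
    """Estimate of -(line integral of F) along a finely sampled path."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(points, points[1:]):
        fx, fy = force(0.5 * (xa + xb), 0.5 * (ya + yb))
        total += fx * (xb - xa) + fy * (yb - ya)
    return -total

def arc(phi0, phi1, n=2000):
    """Points on the unit circle from polar angle phi0 to phi1."""
    return [(math.cos(phi0 + (phi1 - phi0) * k / n),
             math.sin(phi0 + (phi1 - phi0) * k / n)) for k in range(n + 1)]

def polyline(corners, n=2000):
    """Finely sampled straight segments joining the given corner points."""
    pts = [corners[0]]
    for (xa, ya), (xb, yb) in zip(corners, corners[1:]):
        pts += [(xa + (xb - xa) * k / n, ya + (yb - ya) * k / n)
                for k in range(1, n + 1)]
    return pts

w_clockwise = work_against(arc(0.0, -math.pi))      # lower half circle, clockwise
w_anticlockwise = work_against(arc(0.0, math.pi))   # upper half circle, anticlockwise
w_square = work_against(polyline([(1, 0), (1, -1), (-1, -1), (-1, 0)]))
print(w_clockwise, w_anticlockwise, w_square)
```

The clockwise half circle and the lower half square should both give approximately $\pi$, and the anticlockwise half circle approximately $-\pi$.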


EXAMPLE 1.9.3

Gravitational Potential If a force can be described by a scalar function $V_G$
as $\mathbf{F} = -\nabla V_G(\mathbf{r})$ [Eq. (1.65)], everywhere we call $V_G$ its potential in mechanics
and engineering. Because the total variation $dV_G = \nabla V_G\cdot d\mathbf{r} = -\mathbf{F}_G\cdot d\mathbf{r}$ is
the work done against the force along the path $d\mathbf{r}$, the integrated work along
any path from the initial point $\mathbf{r}_0$ to the final point $\mathbf{r}$ is given by a line integral
$\int_{\mathbf{r}_0}^{\mathbf{r}}dV_G = V_G(\mathbf{r}) - V_G(\mathbf{r}_0)$, the potential difference between the end points of
the path. Thus, to find the scalar potential for the gravitational force on a unit
mass $m_1$,

$$\mathbf{F}_G = -\frac{Gm_1m_2\,\hat{\mathbf{r}}}{r^2} = -\frac{k\,\hat{\mathbf{r}}}{r^2}, \quad\text{radially inward} \qquad (1.106)$$

we integrate from infinity, where $V_G$ is zero, into position $\mathbf{r}$. We obtain

$$V_G(r) - V_G(\infty) = -\int_\infty^r\mathbf{F}_G\cdot d\mathbf{r} = +\int_r^\infty\mathbf{F}_G\cdot d\mathbf{r}. \qquad (1.107)$$

By use of $\mathbf{F}_G = -\mathbf{F}_{\text{applied}}$, the potential is the work done in bringing the unit
mass in from infinity. (We can define only the potential difference. Here, we
arbitrarily assign infinity to be a zero of potential.) Since $\mathbf{F}_G$ is radial, we obtain
a contribution to $V_G$ only when $d\mathbf{r}$ is radial or

$$V_G(r) = -\int_r^\infty\frac{k\,dr}{r^2} = -\frac{k}{r} = -\frac{Gm_1m_2}{r}. \qquad (1.108)$$

The negative sign reflects the attractive nature of gravity. ■


Surface Integrals 

Surface integrals appear in the same forms as line integrals, the element of
area also being a vector, $d\boldsymbol{\sigma}$.$^{13}$ Often this area element is written $\mathbf{n}\,dA$, where
$\mathbf{n}$ is a unit (normal) vector to indicate the positive direction.$^{14}$ There are two
conventions for choosing the positive direction. First, if the surface is a closed
surface, we agree to take the outward normal as positive. Second, if the surface
is an open surface, the positive normal depends on the direction in which the
perimeter of the open surface is traversed. If the right-hand fingers are curled
in the direction of travel around the perimeter, the positive normal is indicated
by the thumb of the right hand. As an illustration, a circle in the $xy$-plane
(Fig. 1.29) mapped out from $x$ to $y$ to $-x$ to $-y$ and back to $x$ will have its
positive normal parallel to the positive $z$-axis (for the right-handed coordinate
system).

Analogous to the line integrals, Eq. (1.101), surface integrals may appear
in the form

$$\int\mathbf{V}\cdot d\boldsymbol{\sigma}. \qquad (1.109)$$

This surface integral $\int\mathbf{V}\cdot d\boldsymbol{\sigma}$ may be interpreted as a flow or flux through
the given surface. This is really what we did in Section 1.6 to understand the
significance of the concept of divergence. Note that both physically and from
the dot product the tangential components of the velocity contribute nothing
to the flow through the surface.


13 Recall that in Section 1.3 the area (of a parallelogram) is represented by a cross product vector.
14 Although $\mathbf{n}$ always has unit length, its direction may well be a function of position.

Figure 1.29 Right-Hand Rule for the Positive Normal

Figure 1.30 The Parabola $y = x^2$ for $0 \le y \le 1$ Rotated About the $y$-Axis

EXAMPLE 1.9.4

Moment of Inertia Let us determine the moment of inertia $I_y$ of a segment
of the parabola $y = x^2$ cut off by the line $y = 1$ and rotated about the $y$-axis
(Fig. 1.30). We find

$$I_y = 2\mu\int_{x=0}^{1}\int_{y=x^2}^{1}x^2\,dx\,dy = 2\mu\int_0^1(1 - x^2)\,x^2\,dx = 2\mu\left(\frac{x^3}{3} - \frac{x^5}{5}\right)\Bigg|_0^1 = \frac{4\mu}{15}.$$


The factor of 2 originates in the reflection symmetry of $x \to -x$, and $\mu$ is the
constant mass density. ■

A surface in three-dimensional space may be explicitly given as $z = f(x, y)$
or by the coordinate functions of its points

$$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v)$$

in terms of two parameters $u$, $v$ or in implicit form $F(x, y, z) = 0$. The explicit
form is a special case

$$F(x, y, z) \equiv z - f(x, y)$$

of the general implicit definition of a surface. We find the area $dA = dx\,dy/n_z$
over the projection $dx\,dy$ of the surface onto the $xy$-plane for the latter case.
Here, $n_z = \cos\gamma$ is the $z$-component of the normal unit vector $\mathbf{n}$ at $\mathbf{r}$ on the
surface so that $\gamma$ is the angle of $dA$ with the $xy$-plane. Thus, when we project
$dA$ to the $xy$-plane, we get $dA\cos\gamma = dx\,dy$, which proves this useful formula
for measuring the area of a curved surface. From the gradient properties
we also know that $\mathbf{n} = \nabla F/\sqrt{(\nabla F)^2}$.


EXAMPLE 1.9.5

A Surface Integral Here we apply the general formula for surface integrals
to find the area on $z = xy = f(x, y)$ cut out by the unit circle in the $xy$-plane
shown in Fig. 1.31. With the implicit form $F(x, y, z) = z - xy = 0$, we start from

$$\frac{\partial F}{\partial x} = -\frac{\partial z}{\partial x} = -y, \quad \frac{\partial F}{\partial y} = -\frac{\partial z}{\partial y} = -x, \quad \frac{\partial F}{\partial z} = \frac{\partial z}{\partial z} = 1,$$

which we substitute into

$$n_z = 1\Big/\sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}$$

for the normal to yield the area

$$A = \int_{x=-1}^{x=1}\int_{y=-\sqrt{1-x^2}}^{y=\sqrt{1-x^2}}\sqrt{1 + x^2 + y^2}\,dx\,dy.$$

Figure 1.31 The Surface $z = xy$ Above and Below the Unit Circle $x^2 + y^2 = 1$

For the circular geometry plane polar coordinates $r$, $\varphi$ are more appropriate,
where the radial integral is evaluated by substituting $u = 1 + r^2$ in

$$A = \int_0^1\sqrt{1 + r^2}\,r\,dr\int_0^{2\pi}d\varphi = \pi\int\sqrt{u}\,du = \frac{2\pi}{3}(1 + r^2)^{3/2}\Big|_0^1 = \frac{2\pi}{3}(2\sqrt{2} - 1).$$

More examples of line and surface integrals are provided in Chapter 2.
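The area found in Example 1.9.5 can be confirmed by summing the integrand $\sqrt{1 + x^2 + y^2}$ over the unit disk in plane polar coordinates. The sketch below uses a midpoint rule in both $r$ and $\varphi$; the function name `surface_area` and the grid sizes are illustrative assumptions.

```python
import math

def surface_area(n_r=200, n_phi=200):
    """Midpoint-rule estimate of the area of z = x*y above the unit disk."""
    dr = 1.0 / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            x, y = r * math.cos(phi), r * math.sin(phi)
            # 1/n_z = sqrt(1 + (dz/dx)^2 + (dz/dy)^2) with dz/dx = y, dz/dy = x
            total += math.sqrt(1.0 + y * y + x * x) * r * dr * dphi
    return total

area = surface_area()
exact = 2.0 * math.pi * (2.0 * math.sqrt(2.0) - 1.0) / 3.0
print(area, exact)
```

Keeping the integrand in terms of $x$ and $y$ makes it straightforward to swap in a different surface $z = f(x, y)$, for which no closed form may be available.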


Volume Integrals 


Volume integrals are simpler because the volume element dr is a scalar 
quantity. 16 We have 


/ Vdr = x / V x dr + y / V y dr + z / 

«/ y «/ y ./ y Jy 


V,dr, 


( 1 . 110 ) 


again reducing the vector integral to a vector sum of scalar integrals. 

If the vector 

V = V p (p, <p, z)p + V v (j>, cp, z)Jp + V z (j), <p, z )z 

and its components are given in cylindrical coordinates x — p cos q), y — 
p sin ip with volume element dr — p dp dip dz, the volume integral 


l Vd ’ = ± l v ^ + III 


(V p p + V v (p)p dp dip dz 


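involves integrals over the varying unit vectors of the polar coordinates. To reduce them to scalar integrals, we need to expand the polar coordinate unit vectors in Cartesian unit vectors as follows. Dividing the plane coordinates by $\rho$, we find

$$\hat{\boldsymbol{\rho}} = \frac{1}{\rho}(x, y) = (\cos\varphi, \sin\varphi) = \hat{\mathbf{x}}\cos\varphi + \hat{\mathbf{y}}\sin\varphi.$$

Differentiating $\hat{\boldsymbol{\rho}}^2 = 1$, we see from $0 = d\hat{\boldsymbol{\rho}}^2/d\varphi = 2\hat{\boldsymbol{\rho}} \cdot d\hat{\boldsymbol{\rho}}/d\varphi$ that

$$\frac{d\hat{\boldsymbol{\rho}}}{d\varphi} = -\hat{\mathbf{x}}\sin\varphi + \hat{\mathbf{y}}\cos\varphi$$

is perpendicular to $\hat{\boldsymbol{\rho}}$ and a unit vector; therefore, it is equal to $\hat{\boldsymbol{\varphi}}$. Substituting these expressions into the second integral yields the final result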


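$$\int_V \mathbf{V}\, d\tau = \hat{\mathbf{z}} \int_V V_z\, d\tau + \hat{\mathbf{x}} \iiint [V_\rho\cos\varphi - V_\varphi\sin\varphi]\,\rho\, d\rho\, d\varphi\, dz + \hat{\mathbf{y}} \iiint [V_\rho\sin\varphi + V_\varphi\cos\varphi]\,\rho\, d\rho\, d\varphi\, dz. \tag{1.111}$$

The terms in brackets are the Cartesian components $V_x$, $V_y$ expressed in plane polar coordinates.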
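As an illustration of Eq. (1.111), the sketch below integrates the purely radial field $\mathbf{V} = \hat{\boldsymbol{\rho}}$ over a half cylinder ($0 \le \rho \le 1$, $0 \le \varphi \le \pi$, $0 \le z \le 1$). The choice of field, region, and grid sizes is an assumption made for this example only. By symmetry the $x$-component should vanish, and the $y$-component should equal $\int_0^1 \rho\, d\rho \int_0^\pi \sin\varphi\, d\varphi = 1$.

```python
import math

def integrate_rho_hat(n_rho=200, n_phi=400):
    """Cartesian components of the volume integral of V = rho_hat over a half cylinder.

    Region: 0 <= rho <= 1, 0 <= phi <= pi, 0 <= z <= 1 (the z-integral is a factor 1).
    The bracketed terms of Eq. (1.111) are evaluated at the midpoint of each cell.
    """
    d_rho = 1.0 / n_rho
    d_phi = math.pi / n_phi
    v_rho, v_phi = 1.0, 0.0                    # cylindrical components of V = rho_hat
    ix = iy = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            dtau = rho * d_rho * d_phi         # volume element rho d_rho d_phi dz with dz -> 1
            ix += (v_rho * math.cos(phi) - v_phi * math.sin(phi)) * dtau
            iy += (v_rho * math.sin(phi) + v_phi * math.cos(phi)) * dtau
    return ix, iy

print(integrate_rho_hat())   # x-component near 0, y-component near 1
```

Pulling $\hat{\boldsymbol{\rho}}$ out of the integral as if it were constant would give a meaningless result here, which is the point of expanding in Cartesian unit vectors first.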


16 Frequently, the symbols $d^3r$ and $d^3x$ are used to denote a volume element in coordinate ($xyz$ or $x_1 x_2 x_3$) space.

In spherical polar coordinates, all of the unit vectors depend on the coordinates, none can be pulled out of the integrals, and all have to be expanded in Cartesian unit vectors. This task of rewriting Eq. (1.110) is left as an exercise.


EXAMPLE 1.9.6 


Volume of Rotated Gaussian Rotate the Gaussian $y = \exp(-x^2)$ about the $z$-axis leading to $z = \exp(-x^2 - y^2)$. Then the volume in the polar (cylindrical) coordinates appropriate for the geometry is given by

$$V = \int_{r=0}^{\infty} \int_{\varphi=0}^{2\pi} \int_{z=0}^{e^{-r^2}} r\, dr\, d\varphi\, dz = 2\pi \int_0^\infty r e^{-r^2}\, dr = \pi \int_0^\infty e^{-u}\, du = \pi,$$

upon substituting $\exp(-x^2 - y^2) = \exp(-r^2)$, $dx\,dy = r\, dr\, d\varphi$, $u = r^2$, and $du = 2r\, dr$. ■
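A quick numerical check of this result is sketched below. It is an illustration under stated assumptions: the infinite radial range is truncated at a hypothetical cutoff `r_max = 10`, where $e^{-r^2}$ is negligible, and a midpoint rule replaces the exact integration.

```python
import math

def gaussian_volume(r_max=10.0, n=20000):
    """Midpoint-rule estimate of V = 2*pi * integral_0^r_max r exp(-r^2) dr."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += r * math.exp(-r * r)          # height exp(-r^2) times the Jacobian factor r
    return 2.0 * math.pi * total * h

print(gaussian_volume(), math.pi)
```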


Integral Definitions of Gradient, Divergence, and Curl 

One interesting and significant application of our surface and volume integrals 
is their use in developing alternate definitions of our differential relations. We 
find 


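$$\nabla\varphi = \lim_{\int d\tau \to 0} \frac{\int \varphi\, d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.112}$$

$$\nabla \cdot \mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V} \cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.113}$$

$$\nabla \times \mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int d\boldsymbol{\sigma} \times \mathbf{V}}{\int d\tau}. \tag{1.114}$$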


Figure 1.32

Differential Rectangular Parallelepiped (Origin at Center)

In these three equations, $\int d\tau$ is the volume of a small region of space and $d\boldsymbol{\sigma}$ is the vector area element of this volume. The identification of Eq. (1.113) as the divergence of $\mathbf{V}$ was carried out in Section 1.6. Here, we show that Eq. (1.112) is consistent with our earlier definition of $\nabla\varphi$ [Eq. (1.64)]. For simplicity, we choose $d\tau$ to be the differential volume $dx\,dy\,dz$ (Fig. 1.32). This time, we place the origin at the geometric center of our volume element. The area integral leads to six integrals, one for each of the six faces. Remembering that $d\boldsymbol{\sigma}$ is outward, $d\boldsymbol{\sigma} \cdot \hat{\mathbf{x}} = -|d\boldsymbol{\sigma}|$ for surface EFHG, and $+|d\boldsymbol{\sigma}|$ for surface ABDC, we have


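$$\begin{aligned}
\int \varphi\, d\boldsymbol{\sigma} = {} & -\hat{\mathbf{x}} \int_{EFHG} \left(\varphi - \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right) dy\,dz + \hat{\mathbf{x}} \int_{ABDC} \left(\varphi + \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right) dy\,dz \\
& -\hat{\mathbf{y}} \int_{AEGC} \left(\varphi - \frac{\partial\varphi}{\partial y}\frac{dy}{2}\right) dx\,dz + \hat{\mathbf{y}} \int_{BFHD} \left(\varphi + \frac{\partial\varphi}{\partial y}\frac{dy}{2}\right) dx\,dz \\
& -\hat{\mathbf{z}} \int_{ABFE} \left(\varphi - \frac{\partial\varphi}{\partial z}\frac{dz}{2}\right) dx\,dy + \hat{\mathbf{z}} \int_{CDHG} \left(\varphi + \frac{\partial\varphi}{\partial z}\frac{dz}{2}\right) dx\,dy.
\end{aligned}$$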

Using the first two terms of a Maclaurin expansion, we evaluate each integrand 
at the origin with a correction included to correct for the displacement (±dx/2, 
etc.) of the center of the face from the origin. Having chosen the total volume 
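to be of differential size ($\int d\tau = dx\,dy\,dz$), we drop the integral signs on the right and obtain

$$\int \varphi\, d\boldsymbol{\sigma} = \left(\hat{\mathbf{x}}\frac{\partial\varphi}{\partial x} + \hat{\mathbf{y}}\frac{\partial\varphi}{\partial y} + \hat{\mathbf{z}}\frac{\partial\varphi}{\partial z}\right) dx\,dy\,dz. \tag{1.115}$$

Dividing by

$$\int d\tau = dx\,dy\,dz,$$

we verify Eq. (1.112).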

This verification has been oversimplified in ignoring other correction terms beyond the first derivatives. These additional terms, which are introduced in Section 5.6 when the Taylor expansion is developed, vanish in the limit

$$\int d\tau \to 0 \qquad (dx \to 0,\ dy \to 0,\ dz \to 0).$$

This, of course, is the reason for specifying in Eqs. (1.112)–(1.114) that this limit be taken. Verification of Eq. (1.114) follows these same lines, using a differential volume $d\tau = dx\,dy\,dz$.
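The verification of Eq. (1.112) can be mimicked numerically. The sketch below is an illustration only: it evaluates $\varphi$ at the center of each face of a small cube of side `h` (mirroring the two-term expansion used above), multiplies by the outward vector area, and divides by the volume. The test function $\varphi = x^2 y + z$, the point, and the names are assumptions chosen for the example.

```python
def gradient_from_surface_integral(phi, p, h=1e-3):
    """Estimate grad(phi) at point p as (sum over cube faces of phi * d_sigma) / volume.

    The cube has side h and is centered at p; phi is sampled at each face center.
    """
    volume = h ** 3
    face_area = h ** 2
    grad = []
    for axis in range(3):
        plus = list(p)
        plus[axis] += h / 2                    # center of the face with outward normal +e_axis
        minus = list(p)
        minus[axis] -= h / 2                   # center of the face with outward normal -e_axis
        surface_sum = (phi(*plus) - phi(*minus)) * face_area
        grad.append(surface_sum / volume)
    return grad

def phi(x, y, z):
    return x * x * y + z

# Analytic gradient is (2xy, x^2, 1), which is (4, 1, 1) at the point (1, 2, 3).
print(gradient_from_surface_integral(phi, (1.0, 2.0, 3.0)))
```

With face-center sampling the construction reduces to a central difference in each direction, which is why the neglected correction terms vanish as the cube shrinks.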


EXERCISES 

1.9.1 Find the potential for the electric field generated by a charge q at the 
origin. Normalize the potential to zero at spatial infinity. 

1.9.2 Determine the gravitational field of the earth taken to be spherical and of uniform mass density. Punch out a concentric spherical cavity and show that the field is zero inside it. Show that the field is constant if the cavity is not concentric.

1.9.3 Evaluate

$$\frac{1}{3} \oint_S \mathbf{r} \cdot d\boldsymbol{\sigma}$$

over the unit cube defined by the point (0, 0, 0) and the unit intercepts on the positive $x$-, $y$-, and $z$-axes. Note that (a) $\mathbf{r} \cdot d\boldsymbol{\sigma}$ is zero for three of the surfaces, and (b) each of the three remaining surfaces contributes the same amount to the integral.

1.9.4 Show by expansion of the surface integral that

$$\lim_{\int d\tau \to 0} \frac{\oint_S d\boldsymbol{\sigma} \times \mathbf{V}}{\int d\tau} = \nabla \times \mathbf{V}.$$

Hint. Choose the volume to be a differential volume, $dx\,dy\,dz$.



1.10 Gauss’s Theorem

Here, we derive a useful relation between a surface integral of a vector and the volume integral of the divergence of that vector. Let us assume that the vector $\mathbf{V}$ and its first derivatives are continuous over the simply connected region (without holes) of interest. Then, Gauss’s theorem states that

$$\oint_S \mathbf{V} \cdot d\boldsymbol{\sigma} = \int_V \nabla \cdot \mathbf{V}\, d\tau. \tag{1.116a}$$

In words, the surface integral of a vector over a closed surface equals the volume integral of the divergence of that vector integrated over the volume enclosed by the surface.

Imagine that volume $V$ is subdivided into an arbitrarily large number of tiny (differential) parallelepipeds. For each parallelepiped,

$$\sum_{\text{six surfaces}} \mathbf{V} \cdot d\boldsymbol{\sigma} = \nabla \cdot \mathbf{V}\, d\tau \tag{1.116b}$$

from the analysis of Section 1.6, Eq. (1.75), with $\rho\mathbf{v}$ replaced by $\mathbf{V}$. The summation is over the six faces of the parallelepiped. Summing over all parallelepipeds, we find that the $\mathbf{V} \cdot d\boldsymbol{\sigma}$ terms cancel (pairwise) for all interior faces; only the contributions of the exterior surfaces survive (Fig. 1.33). Analogous to the definition of a Riemann integral as the limit of a sum, we take the limit as the number of parallelepipeds approaches infinity ($\to \infty$) and the dimensions of each approach zero ($\to 0$):

$$\sum_{\text{exterior surfaces}} \mathbf{V} \cdot d\boldsymbol{\sigma} = \sum_{\text{volumes}} \nabla \cdot \mathbf{V}\, d\tau \quad\longrightarrow\quad \oint_S \mathbf{V} \cdot d\boldsymbol{\sigma} = \int_V \nabla \cdot \mathbf{V}\, d\tau.$$

The result is Eq. (1.116a), Gauss’s theorem.

Figure 1.33

Exact Cancellation of $\mathbf{V} \cdot d\boldsymbol{\sigma}$'s on Interior Surfaces. No Cancellation on the Exterior Surface


From a physical standpoint, Eq. (1.75) has established $\nabla \cdot \mathbf{V}$ as the net outflow of field per unit volume. The volume integral then gives the total net outflow. However, the surface integral $\oint \mathbf{V} \cdot d\boldsymbol{\sigma}$ is just another way of expressing this same quantity, and equating the two gives Gauss’s theorem.
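Gauss’s theorem can be checked numerically on a simple region. The sketch below is an illustration, not a proof: it assumes the sample field $\mathbf{V} = (xy, yz, zx)$ on the unit cube, for which $\nabla \cdot \mathbf{V} = x + y + z$, and uses a midpoint rule for both sides. All names are chosen for this example.

```python
def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def V(x, y, z):
    return (x * y, y * z, z * x)

def div_V(x, y, z):
    return y + z + x

def outward_flux(n=20):
    """Surface integral of V . d_sigma over the six faces of the unit cube."""
    pts = midpoints(n)
    dA = 1.0 / n ** 2
    total = 0.0
    for a in pts:
        for b in pts:
            total += (V(1.0, a, b)[0] - V(0.0, a, b)[0]) * dA   # faces x = 1 and x = 0
            total += (V(a, 1.0, b)[1] - V(a, 0.0, b)[1]) * dA   # faces y = 1 and y = 0
            total += (V(a, b, 1.0)[2] - V(a, b, 0.0)[2]) * dA   # faces z = 1 and z = 0
    return total

def volume_integral_of_divergence(n=20):
    """Volume integral of div V over the unit cube."""
    pts = midpoints(n)
    dV = 1.0 / n ** 3
    return sum(div_V(x, y, z) for x in pts for y in pts for z in pts) * dV

print(outward_flux(), volume_integral_of_divergence())   # both should be close to 1.5
```

The minus signs on the faces at 0 encode the outward normal, which is the same bookkeeping that makes interior faces cancel in the derivation above.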


Biographical Data 

Gauss, Carl Friedrich. Gauss, a German mathematician, physicist, and astronomer, was born in Brunswick in 1777 and died in Göttingen in 1855. He was an infant prodigy in mathematics whose education was directed and financed by the Duke of Brunswick. As a teenager, he proved that regular n-polygons can be constructed in Euclidean geometry provided n is a Fermat prime number such as 3, 5, 17, and 257, a major advance in geometry since antiquity. This feat convinced him to stay in mathematics and give up the study of foreign languages. For his Ph.D., he proved the fundamental theorem of algebra, avoiding the then controversial complex numbers he had used to discover it. In his famous treatise Disquisitiones Arithmeticae on number theory, he first proved the quadratic reciprocity theorem and originated the terse style and rigor of mathematical proofs as a series of logical steps, discarding any trace of the original heuristic ideas used in the discovery and checks of examples. Not surprisingly, he hated teaching. He is considered by many as the greatest mathematician of all time and was the last to provide major contributions to all then existing branches of mathematics. As the founder of differential geometry, he developed the intrinsic properties of surfaces, such as curvature, which later motivated B. Riemann to develop the geometry of metric spaces, the mathematical foundation of Einstein’s General Relativity. In astronomy (for the orbit of the asteroid Ceres), he developed the method of least squares for fitting curves to data. In physics, he developed potential theory, and the unit of the magnetic induction is named after him in honor of his measurements and development of units in physics.


Green’s Theorem 


A frequently useful corollary of Gauss’s theorem is a relation known as Green’s 
theorem. If u and v are two scalar functions, we have the identities 


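$$\nabla \cdot (u\nabla v) = u\nabla \cdot \nabla v + (\nabla u) \cdot (\nabla v), \tag{1.117}$$

$$\nabla \cdot (v\nabla u) = v\nabla \cdot \nabla u + (\nabla v) \cdot (\nabla u), \tag{1.118}$$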


which follow from the product rule of differentiation. Subtracting Eq. (1.118) 
from Eq. (1.117), integrating over a volume (u, v, and their derivatives, assumed 
continuous), and applying Eq. (1.116a) (Gauss’s theorem), we obtain 


$$\int_V (u\nabla \cdot \nabla v - v\nabla \cdot \nabla u)\, d\tau = \oint_S (u\nabla v - v\nabla u) \cdot d\boldsymbol{\sigma}. \tag{1.119}$$

This is Green’s theorem, which states that the antisymmetric Laplacian of a pair of functions integrated over a simply connected volume (no holes) is equivalent to the antisymmetric gradient of the pair integrated over the bounding surface. An alternate form of Green’s theorem derived from Eq. (1.117) alone is

$$\oint_S u\nabla v \cdot d\boldsymbol{\sigma} = \int_V u\nabla \cdot \nabla v\, d\tau + \int_V \nabla u \cdot \nabla v\, d\tau. \tag{1.120}$$

Finally, Gauss’s theorem may also be extended to tensors (see Section 2.11).


Biographical Data 

Green, George. Green, an English mathematician, was born in Nottingham 
in 1793 and died near Nottingham in 1841. He studied Laplace’s papers in 
Cambridge and developed potential theory in electrodynamics. 


EXERCISES 

1.10.1 If $\mathbf{B} = \nabla \times \mathbf{A}$, show that

$$\oint_S \mathbf{B} \cdot d\boldsymbol{\sigma} = 0$$

for any closed surface S. State this in words. If symbolic software is 
available, check this for a typical vector potential and specific surfaces, 
such as a sphere or cube. 

1.10.2 Over some volume $V$, let $\psi$ be a solution of Laplace’s equation (with the derivatives appearing there continuous). Prove that the integral over any closed surface in $V$ of the normal derivative of $\psi$ ($\partial\psi/\partial n$, or $\nabla\psi \cdot \mathbf{n}$) will be zero.


1.10.3 In analogy to the integral definition of gradient, divergence, and curl of Section 1.10, show that

$$\nabla^2\varphi = \lim_{\int d\tau \to 0} \frac{\int \nabla\varphi \cdot d\boldsymbol{\sigma}}{\int d\tau}.$$


1.10.4 The electric displacement vector $\mathbf{D}$ satisfies the Maxwell equation $\nabla \cdot \mathbf{D} = \rho$, where $\rho$ is the charge density (per unit volume). At the boundary between two media there is a surface charge density $\sigma$ (per unit area). Show that a boundary condition for $\mathbf{D}$ is

$$(\mathbf{D}_2 - \mathbf{D}_1) \cdot \mathbf{n} = \sigma,$$

where $\mathbf{n}$ is a unit vector normal to the surface and out of medium 1.
Hint. Consider a thin pillbox as shown in Fig. 1.34.


Figure 1.34 
Pillbox 



1.10.5 From Eq. (1.77) and Example 1.6.1, with $\mathbf{V}$ the electric field $\mathbf{E}$ and $f$ the electrostatic potential $\varphi$, show that

$$\int \rho\varphi\, d\tau = \varepsilon_0 \int E^2\, d\tau.$$

This corresponds to a three-dimensional integration by parts.

Hint. $\mathbf{E} = -\nabla\varphi$, $\nabla \cdot \mathbf{E} = \rho/\varepsilon_0$. You may assume that $\varphi$ vanishes at large $r$ at least as fast as $r^{-1}$.


1.10.6 The creation of a localized system of steady electric currents (current density $\mathbf{J}$) and magnetic fields may be shown to require an amount of work

$$W = \frac{1}{2} \int \mathbf{H} \cdot \mathbf{B}\, d\tau.$$

Transform this into

$$W = \frac{1}{2} \int \mathbf{J} \cdot \mathbf{A}\, d\tau,$$

where $\mathbf{A}$ is the magnetic vector potential, $\nabla \times \mathbf{A} = \mathbf{B}$.
Hint. In Maxwell’s equations, take the displacement current term $\partial\mathbf{D}/\partial t = 0$ and explain why using Ohm’s law. If the fields and currents are localized, a bounding surface may be taken far enough out so that the integrals of the fields and currents over the surface yield zero.



1.11 Stokes’s Theorem

Gauss’s theorem relates the volume integral of a derivative of a function to an integral of the function over the closed surface bounding the volume. Here, we consider an analogous relation between the surface integral of a derivative of a function and the line integral of the function, the path of integration being the perimeter bounding the surface.

Let us take the surface and subdivide it into a network of arbitrarily small rectangles. In Section 1.7, we showed that the circulation about such a differential rectangle (in the $xy$-plane) is $\nabla \times \mathbf{V}|_z\, dx\,dy$. From Eq. (1.85) applied to one differential rectangle,

$$\sum_{\text{four sides}} \mathbf{V} \cdot d\boldsymbol{\lambda} = \nabla \times \mathbf{V} \cdot d\boldsymbol{\sigma}. \tag{1.121}$$


We sum over all the little rectangles as in the definition of a Riemann integral. The surface contributions [right-hand side of Eq. (1.121)] are added together. The line integrals [left-hand side of Eq. (1.121)] of all interior line segments cancel identically. Only the line integral around the perimeter survives (Fig. 1.35). Taking the usual limit as the number of rectangles approaches infinity while $dx \to 0$, $dy \to 0$, we have

$$\sum_{\text{exterior line segments}} \mathbf{V} \cdot d\boldsymbol{\lambda} = \sum_{\text{rectangles}} \nabla \times \mathbf{V} \cdot d\boldsymbol{\sigma} \quad\longrightarrow\quad \oint \mathbf{V} \cdot d\boldsymbol{\lambda} = \int_S \nabla \times \mathbf{V} \cdot d\boldsymbol{\sigma}. \tag{1.122}$$

Figure 1.35

Exact Cancellation on Interior Paths; No Cancellation on the Exterior Path

This is Stokes’s theorem. The surface integral on the right is over the surface bounded by the perimeter or contour for the line integral on the left. The direction of the vector representing the area is out of the paper plane toward the reader if the direction of traversal around the contour for the line integral is in the positive mathematical sense as shown in Fig. 1.35.

This demonstration of Stokes’s theorem is limited by the fact that we used a Maclaurin expansion of $\mathbf{V}(x, y, z)$ in establishing Eq. (1.85) in Section 1.7. Actually, we need only demand that the curl of $\mathbf{V}(x, y, z)$ exists and that it be integrable over the surface. Stokes’s theorem obviously applies to an open, simply connected surface. It is possible to consider a closed surface as a limiting case of an open surface with the opening (and therefore the perimeter) shrinking to zero. This is the point of Exercise 1.11.4.

As a special case of Stokes’s theorem, consider the curl of a two-dimensional vector field $\mathbf{V} = (V_1(x, y), V_2(x, y), 0)$. The curl is $\nabla \times \mathbf{V} = (0, 0, \partial V_2/\partial x - \partial V_1/\partial y)$, so

$$\int_S \nabla \times \mathbf{V} \cdot \hat{\mathbf{z}}\, dx\,dy = \int_S \left(\frac{\partial V_2}{\partial x} - \frac{\partial V_1}{\partial y}\right) dx\,dy = \oint_C \mathbf{V} \cdot d\mathbf{r} = \oint_C (V_1\, dx + V_2\, dy),$$

where the curve $C$ is the boundary of the simply connected surface $S$ that is integrated in the positive mathematical sense (anticlockwise). This relation is sometimes also called Green’s theorem. In Chapter 6, we shall use it to prove Cauchy’s theorem for analytic functions.


EXAMPLE 1.11.1 


Area as a Line Integral For the two-dimensional Stokes’s theorem, we first choose $\mathbf{V} = x\hat{\mathbf{y}}$, which gives the area $S = \int_S dx\,dy = \oint_C x\, dy$, and for $\mathbf{V} = y\hat{\mathbf{x}}$ we get similarly $S = \int_S dx\,dy = -\oint_C y\, dx$. Adding both results gives the area

$$S = \frac{1}{2} \oint_C (x\, dy - y\, dx). \quad ■$$
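The area formula lends itself to a short numerical illustration. The sketch below assumes the boundary is approximated by a closed polygon traversed anticlockwise; along each straight segment the integral of $x\,dy - y\,dx$ can be done exactly, so the contour integral becomes a finite sum. The helper names `area_from_boundary` and `ellipse` are illustrative only.

```python
import math

def area_from_boundary(points):
    """Return S = (1/2) * closed-contour integral of (x dy - y dx) for a polygon.

    Along a straight segment from (x0, y0) to (x1, y1) the integral of
    (x dy - y dx) equals x0*y1 - x1*y0, so the contour integral is a sum.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def ellipse(a, b, n=10000):
    """Anticlockwise polygon with n vertices on x = a cos t, y = b sin t."""
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(area_from_boundary(square))              # unit square
print(area_from_boundary(ellipse(3.0, 2.0)))   # close to pi * 3 * 2
```

Traversing the same contour clockwise flips the sign of the result, which reflects the orientation convention stated for the two-dimensional theorem.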


We can use Stokes’s theorem to derive Oersted’s and Faraday’s laws from 
two of Maxwell’s equations and vice versa, thus recognizing that the former 
are an integrated form of the latter. 


EXAMPLE 1.11.2 


Oersted’s and Faraday’s Laws Consider the magnetic field generated by a long wire that carries a stationary current $I$ (Fig. 1.36). Starting from Maxwell’s differential law $\nabla \times \mathbf{H} = \mathbf{J}$ [Eq. (1.97c); with Maxwell’s displacement current $\partial\mathbf{D}/\partial t = 0$ for a stationary current case by Ohm’s law], we integrate over a closed area $S$ perpendicular to and surrounding the wire and apply Stokes’s theorem to get

$$I = \int_S \mathbf{J} \cdot d\boldsymbol{\sigma} = \int_S (\nabla \times \mathbf{H}) \cdot d\boldsymbol{\sigma} = \oint_{\partial S} \mathbf{H} \cdot d\mathbf{r},$$

which is Oersted’s law. Here, the line integral is along $\partial S$, the closed curve surrounding the cross section area $S$.

Similarly, we can integrate Maxwell’s equation for $\nabla \times \mathbf{E}$ [Eq. (1.97d)] to yield Faraday’s induction law. Imagine moving a closed loop ($\partial S$) of wire (of area $S$) across a magnetic induction field $\mathbf{B}$ (Fig. 1.37). At a fixed moment of time we integrate Maxwell’s equation and use Stokes’s theorem, yielding

$$\oint_{\partial S} \mathbf{E} \cdot d\mathbf{r} = \int_S (\nabla \times \mathbf{E}) \cdot d\boldsymbol{\sigma} = -\frac{d}{dt} \int_S \mathbf{B} \cdot d\boldsymbol{\sigma} = -\frac{d\Phi}{dt},$$

which is Faraday’s law. The line integral on the left-hand side represents the voltage induced in the wire loop, whereas the right-hand side is the change with time of the magnetic flux $\Phi$ through the moving surface $S$ of the wire. ■

Figure 1.36

Oersted’s Law for a Long Wire Carrying a Current

Figure 1.37

Faraday’s Induction Law Across a Magnetic Induction Field

SUMMARY

Both Stokes’s and Gauss’s theorems are of tremendous importance in a wide variety of problems involving vector calculus in electrodynamics, where they allow us to derive the local form of Maxwell’s differential equations from the global (integral) form of the experimental laws. An indication of their power and versatility may be obtained from the exercises in Sections 1.10 and 1.11 and the development of potential theory in Section 1.12.


Biographical Data 

Stokes, Sir George Gabriel. Stokes, a British mathematician and physicist, was born in Skreen, Ireland, in 1819 and died in Cambridge in 1903. Son of a clergyman, his talent for mathematics was already evident in school. In 1849, he became Lucasian professor at Cambridge, the chair Isaac Newton once held and currently held by S. Hawking. In 1885, he became president of the Royal Society. He is known for the theory of viscous fluids, with practical applications to the motion of ships in water. He demonstrated his vision by hailing Joule’s work early on and recognizing X-rays as electromagnetic radiation. He received the Rumford and Copley medals of the Royal Society and served as a member of Parliament for Cambridge University in 1887–1892.


EXERCISES 

1.11.1 The calculation of the magnetic moment of a current loop leads to the line integral

$$\oint \mathbf{r} \times d\mathbf{r}.$$

(a) Integrate around the perimeter of a current loop (in the $xy$-plane) and show that the scalar magnitude of this line integral is twice the area of the enclosed surface.

(b) The perimeter of an ellipse is described by $\mathbf{r} = \hat{\mathbf{x}}\, a\cos\theta + \hat{\mathbf{y}}\, b\sin\theta$. From part (a), show that the area of the ellipse is $\pi ab$.

1.11.2 In steady state, the magnetic field $\mathbf{H}$ satisfies the Maxwell equation $\nabla \times \mathbf{H} = \mathbf{J}$, where $\mathbf{J}$ is the current density (per square meter). At the boundary between two media there is a surface current density $\mathbf{K}$ (per meter). Show that a boundary condition on $\mathbf{H}$ is

$$\mathbf{n} \times (\mathbf{H}_2 - \mathbf{H}_1) = \mathbf{K},$$

where $\mathbf{n}$ is a unit vector normal to the surface and out of medium 1.
Hint. Consider a narrow loop perpendicular to the interface as shown in Fig. 1.38.

Figure 1.38
Loop Contour (unit normal $\mathbf{n}$ at the interface between medium 1 and medium 2)

1.11.3 A magnetic induction $\mathbf{B}$ is generated by electric current in a ring of radius $R$. Show that the magnitude of the vector potential $\mathbf{A}$ ($\mathbf{B} = \nabla \times \mathbf{A}$) at the ring is

$$|\mathbf{A}| = \frac{\Phi}{2\pi R},$$

where $\Phi$ is the total magnetic flux passing through the ring.
Note. $\mathbf{A}$ is tangential to the ring.


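1.11.4 Prove that

$$\oint_S \nabla \times \mathbf{V} \cdot d\boldsymbol{\sigma} = 0$$

if $S$ is a closed surface.

1.11.5 Prove that

$$\oint u\nabla v \cdot d\boldsymbol{\lambda} = -\oint v\nabla u \cdot d\boldsymbol{\lambda}.$$

1.11.6 Prove that

$$\oint u\nabla v \cdot d\boldsymbol{\lambda} = \int_S (\nabla u) \times (\nabla v) \cdot d\boldsymbol{\sigma}.$$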


1.12 Potential Theory 


Scalar Potential 


This section formulates the conditions under which a force field $\mathbf{F}$ is conservative. From a mathematical standpoint, it is a practice session of typical applications of Gauss’s and Stokes’s theorems in physics.

If a force in a given simply connected region of space $V$ (i.e., no holes in it) can be expressed as the negative gradient of a scalar function $\varphi$,

$$\mathbf{F} = -\nabla\varphi, \tag{1.123}$$

we call $\varphi$ a scalar potential that describes the force by one function instead of three, which is a significant simplification. A scalar potential is only determined up to an additive constant, which can be used to adjust its value at infinity (usually zero) or at some other point. The force $\mathbf{F}$ appearing as the negative gradient of a single-valued scalar potential is labeled a conservative force. We want to know when a scalar potential function exists. To answer this question, we establish two other relations as equivalent to Eq. (1.123):

$$\nabla \times \mathbf{F} = 0 \tag{1.124}$$

and

$$\oint \mathbf{F} \cdot d\mathbf{r} = 0, \tag{1.125}$$

for every closed path in our simply connected region $V$. We proceed to show that each of these three equations implies the other two. Let us start with

$$\mathbf{F} = -\nabla\varphi. \tag{1.126}$$

Then

$$\nabla \times \mathbf{F} = -\nabla \times \nabla\varphi = 0 \tag{1.127}$$

by Eq. (1.92), or Eq. (1.123) implies Eq. (1.124). Turning to the line integral, we have

$$\oint \mathbf{F} \cdot d\mathbf{r} = -\oint \nabla\varphi \cdot d\mathbf{r} = -\oint d\varphi \tag{1.128}$$

using Eq. (1.58). Now $d\varphi$ integrates to give $\varphi$. Because we have specified a closed loop, the end points coincide and we get zero for every closed path in our region $V$ for which Eq. (1.123) holds. It is important to note the restriction that the potential be single-valued and that Eq. (1.123) hold for all points in $V$. This derivation may also apply to a scalar magnetic potential as long as no net current is encircled. As soon as we choose a path in space that encircles a net current, the scalar magnetic potential ceases to be single-valued and our analysis no longer applies because $V$ is no longer simply connected.

Continuing this demonstration of equivalence, let us assume that Eq. (1.125) holds. If $\oint \mathbf{F} \cdot d\mathbf{r} = 0$ for all paths in $V$, the value of the integral joining two distinct points $A$ and $B$ is independent of the path (Fig. 1.39). Our premise is that

$$\oint_{ACBDA} \mathbf{F} \cdot d\mathbf{r} = 0. \tag{1.129}$$

Therefore,

$$\int_{ACB} \mathbf{F} \cdot d\mathbf{r} = -\int_{BDA} \mathbf{F} \cdot d\mathbf{r} = \int_{ADB} \mathbf{F} \cdot d\mathbf{r}, \tag{1.130}$$

reversing the sign by reversing the direction of integration. Physically, this means that the work done in going from $A$ to $B$ is independent of the path and that the work done in going around a closed path is zero. This is the reason for labeling such a force conservative: Energy is conserved.

Figure 1.39

Possible Paths for Doing Work

With the result shown in Eq. (1.130), we have the work done dependent only on the end points $A$ and $B$. That is,

$$\text{Work done by force} = \int_A^B \mathbf{F} \cdot d\mathbf{r} = \varphi(A) - \varphi(B). \tag{1.131}$$

Equation (1.131) defines a scalar potential (strictly speaking, the difference in potential between points $A$ and $B$) and provides a means of calculating the potential. If point $B$ is taken as a variable such as $(x, y, z)$, then differentiation with respect to $x$, $y$, and $z$ will recover Eq. (1.123).

The choice of sign on the right-hand side is arbitrary. The choice here is made to achieve agreement with Eq. (1.123) and to ensure that water will run downhill rather than uphill. For points $A$ and $B$ separated by a length $d\mathbf{r}$, Eq. (1.131) becomes

$$\mathbf{F} \cdot d\mathbf{r} = -d\varphi = -\nabla\varphi \cdot d\mathbf{r}. \tag{1.132}$$

This may be rewritten

$$(\mathbf{F} + \nabla\varphi) \cdot d\mathbf{r} = 0, \tag{1.133}$$

and since $d\mathbf{r} \neq 0$ is arbitrary, Eq. (1.126) must follow. If

$$\oint \mathbf{F} \cdot d\mathbf{r} = 0, \tag{1.134}$$

we may obtain Eq. (1.124) by using Stokes’s theorem [Eq. (1.122)]:

$$\oint \mathbf{F} \cdot d\mathbf{r} = \int \nabla \times \mathbf{F} \cdot d\boldsymbol{\sigma}. \tag{1.135}$$

If we take the path of integration to be the perimeter of an arbitrary differential area $d\boldsymbol{\sigma}$, the integrand in the surface integral must vanish. Hence, Eq. (1.125) implies Eq. (1.124).

Finally, if $\nabla \times \mathbf{F} = 0$, we need only reverse our statement of Stokes’s theorem [Eq. (1.135)] to derive Eq. (1.125). Then, by Eqs. (1.131)–(1.133) the initial statement $\mathbf{F} = -\nabla\varphi$ is derived. The triple equivalence is illustrated in Fig. 1.40.

Figure 1.40

Equivalent Formulations of a Conservative Force

SUMMARY

A single-valued scalar potential function $\varphi$ exists if and only if $\mathbf{F}$ is irrotational so that the work done around every closed loop is zero. The gravitational and electrostatic force fields given by Eq. (1.88) are irrotational and therefore conservative. Gravitational and electrostatic scalar potentials exist. Now, by calculating the work done [Eq. (1.131)], we proceed to determine three potentials (Fig. 1.41).

Figure 1.41

Potential Energy Versus Distance (Gravitational, Centrifugal, and Simple Harmonic Oscillator)
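The path independence expressed by Eqs. (1.130) and (1.131) can be illustrated numerically. The sketch below is an example under stated assumptions: it works in two dimensions, takes the restoring force $\mathbf{F} = -k\mathbf{r}$ with potential $\varphi = \tfrac{1}{2}kr^2$ as the conservative case, and uses the rotational field $(-y, x)$ as a counterexample. The paths, the value of `k`, and all names are chosen for illustration.

```python
def work(force, path, n=200):
    """Midpoint-rule estimate of the line integral of force . dr along a polyline."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for i in range(n):
            x = x0 + (i + 0.5) * dx            # midpoint of the i-th sub-step
            y = y0 + (i + 0.5) * dy
            fx, fy = force(x, y)
            total += fx * dx + fy * dy
    return total

k = 2.0

def spring(x, y):
    return (-k * x, -k * y)                    # F = -k r, curl-free

def swirl(x, y):
    return (-y, x)                             # curl = 2, not conservative

def phi(x, y):
    return 0.5 * k * (x * x + y * y)           # potential of the spring force

A, B = (1.0, 0.0), (0.0, 2.0)
direct = [A, B]                                # straight line from A to B
detour = [A, (1.0, 2.0), B]                    # two legs through the corner (1, 2)

print(work(spring, direct), work(spring, detour), phi(*A) - phi(*B))
print(work(swirl, direct), work(swirl, detour))
```

For the spring force both paths give $\varphi(A) - \varphi(B) = -3$; for the rotational field the two paths give different values, so no single-valued potential can reproduce it.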



EXAMPLE 1.12.1 


Centrifugal Potential Calculate the scalar potential for the centrifugal force per unit mass, $\mathbf{F}_C = \omega^2 r\,\hat{\mathbf{r}}$, radially outward. Physically, the centrifugal force is what you feel when on a merry-go-round. Proceeding as in Example 1.9.3, but integrating from the origin outward and taking $\varphi_C(0) = 0$, we have

$$\varphi_C(r) = -\int_0^r \mathbf{F}_C \cdot d\mathbf{r} = -\frac{\omega^2 r^2}{2}.$$

If we reverse signs, taking $\mathbf{F}_{SHO} = -k\mathbf{r}$, we obtain $\varphi_{SHO} = \frac{1}{2}kr^2$, the simple harmonic oscillator potential.

The gravitational, centrifugal, and simple harmonic oscillator potentials are shown in Fig. 1.41. Clearly, the simple harmonic oscillator yields stability and describes a restoring force. The centrifugal potential describes an unstable situation. ■

When a vector $\mathbf{B}$ is solenoidal, a vector potential $\mathbf{A}$ exists such that $\mathbf{B} = \nabla \times \mathbf{A}$. $\mathbf{A}$ is undetermined to within an additive gradient of a scalar function. This is similar to the arbitrary zero of a potential, due to an additive constant of the scalar potential.

In many problems, the magnetic vector potential $\mathbf{A}$ will be obtained from the current distribution that produces the magnetic induction $\mathbf{B}$. This means solving Poisson’s (vector) equation (see Exercise 1.13.4).


EXERCISES 

1.12.1 The usual problem in classical mechanics is to calculate the motion of a particle given the potential. For a uniform density ($\rho_0$), nonrotating massive sphere, Gauss’s law (Section 1.10) leads to a gravitational force on a unit mass $m_0$ at a point $r_0$ produced by the attraction of the mass at $r \le r_0$. The mass at $r > r_0$ contributes nothing to the force.

(a) Show that $\mathbf{F}/m_0 = -(4\pi G\rho_0/3)\,\mathbf{r}$, $0 \le r \le a$, where $a$ is the radius of the sphere.

(b) Find the corresponding gravitational potential, $0 \le r \le a$.

(c) Imagine a vertical hole running completely through the center of the earth and out to the far side. Neglecting the rotation of the earth and assuming a uniform density $\rho_0 = 5.5\ \text{g/cm}^3$, calculate the nature of the motion of a particle dropped into the hole. What is its period?
Note. $F \propto r$ is actually a very poor approximation. Because of varying density, the approximation $F$ = constant, along the outer half of a radial line, and $F \propto r$, along the inner half, is much closer.

1.12.2 The origin of the Cartesian coordinates is at the earth’s center. The moon is on the $z$-axis, a fixed distance $R$ away (center-to-center distance). The tidal force exerted by the moon on a particle at the earth’s surface (point $x, y, z$) is given by

$$F_x = -GMm\frac{x}{R^3}, \qquad F_y = -GMm\frac{y}{R^3}, \qquad F_z = +2GMm\frac{z}{R^3}.$$

Find the potential that yields this tidal force.

$$\text{ANS.}\quad -\frac{GMm}{R^3}\left(z^2 - \frac{1}{2}x^2 - \frac{1}{2}y^2\right).$$

In terms of the Legendre polynomials of Chapter 11, this becomes

$$-\frac{GMm}{R^3}\, r^2 P_2(\cos\theta).$$
1.12.3 Vector $\mathbf{B}$ is formed by the product of two gradients

$$\mathbf{B} = (\nabla u) \times (\nabla v),$$

where $u$ and $v$ are scalar functions.

(a) Show that $\mathbf{B}$ is solenoidal.

(b) Show that

$$\mathbf{A} = \frac{1}{2}(u\nabla v - v\nabla u)$$

is a vector potential for $\mathbf{B}$ in that

$$\mathbf{B} = \nabla \times \mathbf{A}.$$


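1.12.4 The magnetic induction $\mathbf{B}$ is related to the magnetic vector potential $\mathbf{A}$ by $\mathbf{B} = \nabla \times \mathbf{A}$. By Stokes’s theorem,

$$\int \mathbf{B} \cdot d\boldsymbol{\sigma} = \oint \mathbf{A} \cdot d\mathbf{r}.$$

Show that each side of this equation is invariant under the gauge transformation, $\mathbf{A} \to \mathbf{A} + \nabla\Lambda$, where $\Lambda$ is an arbitrary scalar function.
Note. Take the function $\Lambda$ to be single-valued.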

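1.12.5 With $\mathbf{E}$ as the electric field and $\mathbf{A}$ as the magnetic vector potential, show that $[\mathbf{E} + \partial\mathbf{A}/\partial t]$ is irrotational and that we may therefore write

$$\mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t}.$$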

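1.12.6 The total force on a charge $q$ moving with velocity $\mathbf{v}$ is

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}).$$

Using the scalar and vector potentials, show that

$$\mathbf{F} = q\left[-\nabla\varphi - \frac{d\mathbf{A}}{dt} + \nabla(\mathbf{A} \cdot \mathbf{v})\right].$$

Note that we now have a total time derivative of $\mathbf{A}$ in place of the partial derivative of Exercise 1.12.5.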

1.12.7 A planet of mass $m$ moves on a circular orbit of radius $r$ around a star in an attractive gravitational potential $V = kr^n$. Find the conditions on the exponent $n$ for the orbit to be stable.

Note. You can set $k = -GmM$, where $M$ is the mass of the star, and use classical mechanics. Einstein’s General Relativity gives $n = -1$, whereas in Newton’s gravitation the Kepler laws are needed in addition to determining that $n = -1$.








1.13 Gauss’s Law and Poisson’s Equation 


Consider a point electric charge $q$ at the origin of our coordinate system. This produces an electric field $\mathbf{E}$$^{16}$ given by

$$\mathbf{E} = \frac{q\,\hat{\mathbf{r}}}{4\pi\varepsilon_0 r^2}. \tag{1.136}$$

We now derive Gauss’s law, which states that the surface integral in Fig. 1.42 is $q/\varepsilon_0$ if the closed surface $S$ includes the origin (where $q$ is located) and zero if the surface does not include the origin. The surface $S$ is any closed surface; it need not be spherical.

16 The electric field $\mathbf{E}$ is defined as the force per unit charge on a small stationary test charge $q_t$: $\mathbf{E} = \mathbf{F}/q_t$. From Coulomb’s law, the force on $q_t$ due to $q$ is $\mathbf{F} = (qq_t/4\pi\varepsilon_0)(\hat{\mathbf{r}}/r^2)$. When we divide by $q_t$, Eq. (1.136) follows.

Figure 1.43

Exclusion of the Origin

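Using Gauss’s theorem [Eq. (1.116a)] (and neglecting the scale factor $q/4\pi\varepsilon_0$), we obtain

$$\oint_S \frac{\hat{\mathbf{r}} \cdot d\boldsymbol{\sigma}}{r^2} = \int_V \nabla \cdot \left(\frac{\hat{\mathbf{r}}}{r^2}\right) d\tau = 0 \tag{1.137}$$

by Example 1.6.1, provided the surface $S$ does not include the origin, where the integrands are not defined. This proves the second part of Gauss’s law.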

The first part, in which the surface $S$ must include the origin, may be handled by surrounding the origin with a small sphere $S'$ of radius $\delta$ (Fig. 1.43). So that there will be no question as to what is inside and what is outside, imagine the volume outside the outer surface $S$ and the volume inside surface $S'$ ($r < \delta$) connected by a small hole. This joins surfaces $S$ and $S'$, combining them into one single, simply connected closed surface. Because the radius of the imaginary hole may be made vanishingly small, there is no additional contribution to the surface integral. The inner surface is deliberately chosen to be spherical so that we will be able to integrate over it. Gauss’s theorem now applies to the volume between $S$ and $S'$ without any difficulty. We have

$$\oint_S \frac{\hat{\mathbf{r}} \cdot d\boldsymbol{\sigma}}{r^2} + \oint_{S'} \frac{\hat{\mathbf{r}} \cdot d\boldsymbol{\sigma}'}{\delta^2} = 0. \tag{1.138}$$


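We may evaluate the second integral for $d\boldsymbol{\sigma}' = -\hat{\mathbf{r}}\,\delta^2\, d\Omega$, in which $d\Omega$ is an element of solid angle. The minus sign appears because we agreed in Section 1.9 to have the positive normal $\hat{\mathbf{r}}'$ outward from the volume. In this case, the outward $\hat{\mathbf{r}}'$ is in the negative radial direction, $\hat{\mathbf{r}}' = -\hat{\mathbf{r}}$. By integrating over all angles, we have

$$\oint_{S'} \frac{\hat{\mathbf{r}} \cdot d\boldsymbol{\sigma}'}{\delta^2} = -\oint_{S'} \frac{\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}\,\delta^2\, d\Omega}{\delta^2} = -4\pi, \tag{1.139}$$

independent of the radius $\delta$. With the constants from Eq. (1.136), this results in

$$\oint_S \mathbf{E} \cdot d\boldsymbol{\sigma} = \frac{q}{4\pi\varepsilon_0}\, 4\pi = \frac{q}{\varepsilon_0}, \tag{1.140}$$

completing the proof of Gauss’s law. Notice that although the surface $S$ may be spherical, it need not be spherical.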

Going a bit further, we consider a distributed charge so that

$$q = \int_V \rho\, d\tau. \tag{1.141}$$

Equation (1.140) still applies, with $q$ now interpreted as the total distributed charge enclosed by surface $S$:

$$\oint_S \mathbf{E} \cdot d\boldsymbol{\sigma} = \int_V \frac{\rho}{\varepsilon_0}\, d\tau. \tag{1.142}$$

Using Gauss’s theorem, we have

$$\int_V \nabla \cdot \mathbf{E}\, d\tau = \int_V \frac{\rho}{\varepsilon_0}\, d\tau. \tag{1.143}$$

Since our volume is completely arbitrary, the integrands must be equal or

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}, \tag{1.144}$$


one of Maxwell’s equations. If we reverse the argument, Gauss’s law follows 
immediately from Maxwell’s equation by integration. 
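The statement that the enclosing surface need not be spherical can be illustrated numerically. The sketch below is an example under stated assumptions: it drops the scale factor $q/4\pi\varepsilon_0$ as in the derivation above, integrates $\hat{\mathbf{r}}/r^2$ over the six faces of an axis-aligned cube with a midpoint rule, and compares cubes that do and do not contain the origin. The function name, cube positions, and grid size are illustrative.

```python
import math

def flux_through_cube(center, half=1.0, n=100):
    """Midpoint-rule flux of r_hat / r^2 through an axis-aligned cube of side 2*half."""
    h = 2.0 * half / n
    offsets = [-half + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):               # the two faces normal to this axis
            for a in offsets:
                for b in offsets:
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = a
                    p[(axis + 2) % 3] = b
                    x, y, z = (p[i] + center[i] for i in range(3))
                    r3 = (x * x + y * y + z * z) ** 1.5
                    # outward normal is sign * e_axis; the field is r / |r|^3
                    total += sign * (x, y, z)[axis] / r3 * h * h
    return total

print(flux_through_cube((0.0, 0.0, 0.0)), 4.0 * math.pi)   # origin at the center
print(flux_through_cube((0.3, -0.2, 0.4)))                  # origin inside, off center
print(flux_through_cube((3.0, 0.0, 0.0)))                   # origin outside the cube
```

The first two values should be close to $4\pi$ and the last close to zero, in line with Eqs. (1.137) and (1.139); multiplying by $q/4\pi\varepsilon_0$ would give $q/\varepsilon_0$ or zero.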


Poisson’s Equation 

Replacing $\mathbf{E}$ by $-\nabla\varphi$, Eq. (1.144) becomes

$$\nabla \cdot \nabla\varphi = -\frac{\rho}{\varepsilon_0}, \tag{1.145}$$

which is Poisson’s equation. We know a solution,

$$\varphi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r}')\, d\tau'}{|\mathbf{r} - \mathbf{r}'|},$$

from generalizing a sum of Coulomb potentials for discrete charges in electrostatics to a continuous charge distribution.

For the condition $\rho = 0$ this reduces to an even more famous equation, the Laplace equation,

$$\nabla \cdot \nabla\varphi = 0. \tag{1.146}$$

We encounter Laplace’s equation frequently in discussing various curved coordinate systems (Chapter 2) and the special functions of mathematical physics that appear as its solutions in Chapter 11.

From direct comparison of the Coulomb electrostatic force law and 
Newton’s law of universal gravitation, 


F* = 


1 glg2 , 

47T£o T 2 


Fr, = 


-G ——r. 
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All of the potential theory of this section therefore applies equally well to 
gravitational potentials. For example, the gravitational Poisson equation is 

$$\nabla\cdot\nabla\varphi = +4\pi G\rho, \tag{1.147}$$

with $\rho$ now a mass density.


Biographical Data 

Poisson, Siméon Denis. Poisson, a French mathematician, was born in
Pithiviers, France in 1781 and died in Paris in 1840. He studied mathematics
at the École Polytechnique under Laplace and Lagrange, whom he so
impressed with his talent that he became professor there in 1802. He
contributed to their celestial mechanics, Fourier’s heat theory, and probability
theory, among others.


EXERCISES 


1.13.1 Develop Gauss’s law for the two-dimensional case in which

$$\varphi = -q\,\frac{\ln\rho}{2\pi\varepsilon_0}, \qquad \mathbf{E} = -\nabla\varphi = q\,\frac{\hat{\boldsymbol{\rho}}}{2\pi\varepsilon_0\,\rho},$$

where q is the charge at the origin or the line charge per unit length if the
two-dimensional system is a unit thickness slice of a three-dimensional
(circular cylindrical) system. The variable $\rho$ is measured radially
outward from the line charge, and $\hat{\boldsymbol{\rho}}$ is the corresponding unit vector (see
Section 2.2). If graphical software is available, draw the potential and
field for the $q/2\pi\varepsilon_0 = 1$ case.


1.13.2 (a) Show that Gauss’s law follows from Maxwell’s equation

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}$$

by integrating over a closed surface. Here, $\rho$ is the charge density.
(b) Assuming that the electric field of a point charge q is spherically
symmetric, show that Gauss’s law implies the Coulomb inverse
square expression

$$\mathbf{E} = \frac{q\,\hat{\mathbf{r}}}{4\pi\varepsilon_0 r^2}.$$

1.13.3 Show that the value of the electrostatic potential $\varphi$ at any point P is
equal to the average of the potential over any spherical surface centered
on P. There are no electric charges on or within the sphere.

Hint. Use Green’s theorem [Eq. (1.119)], with $u^{-1} = r$, the distance
from P, and $v = \varphi$.


1.13.4 Using Maxwell’s equations, show that for a system (steady current) the
magnetic vector potential $\mathbf{A}$ satisfies a vector Poisson equation

$$\nabla^2\mathbf{A} = -\mu\mathbf{J},$$

provided we require $\nabla\cdot\mathbf{A} = 0$ in Coulomb gauge.

1.14 Dirac Delta Function


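From Example 1.6.1 and the development of Gauss’s law in Section 1.13,

$$\int \nabla\cdot\nabla\left(\frac{1}{r}\right) d\tau = -\int \nabla\cdot\left(\frac{\hat{\mathbf{r}}}{r^2}\right) d\tau = \begin{cases} -4\pi \\ 0 \end{cases} \tag{1.148}$$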


depending on whether the integration includes the origin $\mathbf{r} = 0$ or not. This
result may be conveniently expressed by introducing the Dirac delta function,

$$\nabla^2\frac{1}{r} = -4\pi\delta(\mathbf{r}) = -4\pi\delta(x)\delta(y)\delta(z). \tag{1.149}$$

This Dirac delta function is defined by its assigned properties

$$\delta(x) = 0, \qquad x \neq 0, \tag{1.150}$$

$$f(0) = \int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx, \tag{1.151}$$


where $f(x)$ is any well-behaved function and the integration includes the
origin. As a special case of Eq. (1.151),

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1. \tag{1.152}$$


From Eq. (1.151), $\delta(x)$ must be an infinitely high, infinitely thin spike at $x = 0$,
as in the description of an impulsive force or the charge density for a point
charge.$^{17}$ The problem is that no such function exists in the usual sense
of function. However, the crucial property in Eq. (1.151) can be developed
rigorously as the limit of a sequence of functions, a distribution. For example,
the delta function may be approximated by the sequences of functions in n for
$n \to \infty$ [Eqs. (1.153)-(1.156) and Figs. 1.44-1.47]:

$$\delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\[4pt] n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\[4pt] 0, & x > \dfrac{1}{2n}, \end{cases} \tag{1.153}$$

$$\delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2), \tag{1.154}$$

$$\delta_n(x) = \frac{n}{\pi}\,\frac{1}{1+n^2x^2}, \tag{1.155}$$

$$\delta_n(x) = \frac{\sin nx}{\pi x} = \frac{1}{2\pi}\int_{-n}^{n} e^{ixt}\,dt. \tag{1.156}$$


17 The delta function is frequently invoked to describe very short-range forces such as nuclear
forces. It also appears in the normalization of continuum wave functions of quantum mechanics.

EXAMPLE 1.14.1


Figure 1.44 δ Sequence Function, Eq. (1.153)

Figure 1.45 δ Sequence Function, Eq. (1.154)


Let us evaluate $\int_{-\pi}^{\pi}\cos x\,\delta(x)\,dx = \cos 0 = 1$ using the sequence of Eq. (1.153).
We find

$$\int_{-1/2n}^{1/2n} n\cos x\,dx = n\sin x\,\Big|_{-1/2n}^{1/2n} = n\left[\sin\frac{1}{2n} - \sin\left(-\frac{1}{2n}\right)\right] = 2n\sin\frac{1}{2n} = 2n\left[\frac{1}{2n} + O(1/n^3)\right] \to 1 \quad \text{for } n\to\infty.$$

Figure 1.46 δ Sequence Function, Eq. (1.155)

Figure 1.47 δ Sequence Function, Eq. (1.156)
Notice how the integration limits change in the first step. Similarly,
$\int_{-\pi}^{\pi}\sin x\,\delta(x)\,dx = \sin 0 = 0$. We could have used Eq. (1.155) instead,

$$\frac{n}{\pi}\int_{-\pi}^{\pi}\frac{\cos x\,dx}{1+n^2x^2} = \frac{n}{\pi}\int_{-\pi}^{\pi}\frac{1 - x^2/2 + \cdots}{1+n^2x^2}\,dx \approx \frac{n}{\pi}\int_{-\pi}^{\pi}\frac{dx}{1+n^2x^2}$$

$$= \frac{1}{\pi}\int_{-n\pi}^{n\pi}\frac{dy}{1+y^2} = \frac{1}{\pi}\left[\arctan(n\pi) - \arctan(-n\pi)\right] = \frac{2}{\pi}\arctan(n\pi) \to \frac{2}{\pi}\,\frac{\pi}{2} = 1, \quad \text{for } n\to\infty,$$

by keeping just the first term of the power expansion of cos x. Again, we could
have changed the integration limits to $\pm\pi/n$ in the first step for all terms with
positive powers of x because the denominator is so large, except close to x = 0
for large n. This explains why the higher order terms of the cos x power series
do not contribute. ■
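
The limits worked out in this example can also be checked numerically. The following sketch makes several assumptions that are not in the text: a plain midpoint rule on $[-\pi, \pi]$, the value n = 100, and the helper names `box`, `gaussian`, `lorentzian`, and `sift`. It applies the sequences of Eqs. (1.153)-(1.155) to f(x) = cos x; each result should be close to cos 0 = 1, with the Lorentzian form approaching the limit more slowly because of its long tails.

```python
import math

def midpoint(f, a, b, steps=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def box(n):
    """Rectangular delta sequence, Eq. (1.153)."""
    return lambda x: float(n) if abs(x) < 1.0 / (2 * n) else 0.0

def gaussian(n):
    """Gaussian delta sequence, Eq. (1.154)."""
    return lambda x: n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def lorentzian(n):
    """Lorentzian delta sequence, Eq. (1.155)."""
    return lambda x: n / (math.pi * (1.0 + (n * x) ** 2))

def sift(delta_n, f):
    """Approximate the integral of delta_n(x) f(x) over [-pi, pi]."""
    return midpoint(lambda x: delta_n(x) * f(x), -math.pi, math.pi)

results = {
    name: sift(make(100), math.cos)
    for name, make in (("box", box), ("gaussian", gaussian), ("lorentzian", lorentzian))
}
for name, value in results.items():
    print(name, value)
```

Increasing n (together with the number of integration steps, so that the narrowing peak stays resolved) should bring all three values closer to 1.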


These approximations have varying degrees of usefulness. Equation (1.153) 
is useful in providing a simple derivation of the integral property [Eq. (1.151)]. 
Equation (1.154) is convenient to differentiate. Its derivatives lead to the 
Hermite polynomials. Equation (1.156) is particularly useful in Fourier analysis 
and in its applications to quantum mechanics. In the theory of Fourier series, 
Eq. (1.156) often appears (modified) as the Dirichlet kernel: 


$$\delta_n(x) = \frac{1}{2\pi}\,\frac{\sin\left[\left(n+\tfrac{1}{2}\right)x\right]}{\sin\left(\tfrac{1}{2}x\right)}. \tag{1.157}$$


In using these approximations in Eq. (1.151) and later, we assume that f(x) is 
integrable—it offers no problems at large x. 

For most physical purposes such approximations are quite adequate. From 
a mathematical standpoint, the situation is still unsatisfactory: The limits 


$$\lim_{n\to\infty}\delta_n(x)$$

do not exist.

A way out of this difficulty is provided by the theory of distributions.
Recognizing that Eq. (1.151) is the fundamental property, we focus our attention
on it rather than on $\delta(x)$. Equations (1.153)-(1.156), with n = 1, 2, 3, ..., may
be interpreted as sequences of normalized functions:

$$\int_{-\infty}^{\infty}\delta_n(x)\,dx = 1. \tag{1.158}$$


The sequence of integrals has the limit 


$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\delta_n(x)f(x)\,dx = f(0). \tag{1.159}$$


Note that Eq. (1.159) is the limit of a sequence of integrals. Again, the limit of
$\delta_n(x)$, $n \to \infty$, does not exist. [The limits for all four forms of $\delta_n(x)$ diverge at
x = 0.]

We may treat 8 (x) consistently in the form 


$$\int_{-\infty}^{\infty}\delta(x)f(x)\,dx = \lim_{n\to\infty}\int_{-\infty}^{\infty}\delta_n(x)f(x)\,dx. \tag{1.160}$$


$\delta(x)$ is labeled a distribution (not a function) defined by the sequences $\delta_n(x)$ as
indicated in Eq. (1.160). We might emphasize that the integral on the left-hand
side of Eq. (1.160) is not a Riemann integral.$^{18}$ It is a limit.


18 It can be treated as a Stieltjes integral if desired. $\delta(x)\,dx$ is replaced by $du(x)$, where $u(x)$ is the
Heaviside step function.

This distribution $\delta(x)$ is only one of an infinity of possible distributions, but
it is the one we are interested in because of Eq. (1.151).

From these sequences of functions, we see that Dirac’s delta function must
be even in x, $\delta(-x) = \delta(x)$.

Let us now consider a detailed application of the Dirac delta function to a 
single charge and illustrate the singularity of the electric field at the origin. 


EXAMPLE 1.14.2 


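Total Charge inside a Sphere Consider the total electric flux $\oint\mathbf{E}\cdot d\boldsymbol{\sigma}$ out
of a sphere of radius R around the origin surrounding n charges $e_j$ located at
the points $\mathbf{r}_j$ with $r_j < R$ (i.e., inside the sphere). The electric field strength
$\mathbf{E} = -\nabla\varphi(\mathbf{r})$, where the potential

$$\varphi = \sum_{j=1}^{n}\frac{e_j}{4\pi\varepsilon_0|\mathbf{r}-\mathbf{r}_j|} = \int\frac{\rho(\mathbf{r}')}{4\pi\varepsilon_0|\mathbf{r}-\mathbf{r}'|}\,d^3r'$$

is the sum of the Coulomb potentials generated by each charge and the total
charge density is $\rho(\mathbf{r}) = \sum_j e_j\,\delta(\mathbf{r}-\mathbf{r}_j)$. The delta function is used here as an
abbreviation of a pointlike density. Now we use Gauss’s theorem for

$$\oint\mathbf{E}\cdot d\boldsymbol{\sigma} = -\oint\nabla\varphi\cdot d\boldsymbol{\sigma} = -\int\nabla^2\varphi\,d\tau = \int\frac{\rho(\mathbf{r})}{\varepsilon_0}\,d\tau = \frac{\sum_j e_j}{\varepsilon_0}$$

in conjunction with the differential form of Gauss’s law $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$ and

$$\sum_j e_j\int\delta(\mathbf{r}-\mathbf{r}_j)\,d\tau = \sum_j e_j. \quad ■$$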


The integral property [Eq. (1.151)] is useful in cases in which the argument
of the delta function is a function g(x) with simple zeros on the real axis,
which leads to the rules

$$\delta(ax) = \frac{1}{a}\,\delta(x), \qquad a > 0, \tag{1.161}$$

$$\delta(g(x)) = \sum_{\substack{a,\; g(a)=0,\\ g'(a)\neq 0}}\frac{\delta(x-a)}{|g'(a)|}. \tag{1.162}$$

To obtain Eq. (1.161) we change the integration variable in 


$$\int_{-\infty}^{\infty} f(x)\,\delta(ax)\,dx = \frac{1}{a}\int_{-\infty}^{\infty} f\!\left(\frac{y}{a}\right)\delta(y)\,dy = \frac{1}{a}\,f(0)$$


and apply Eq. (1.151). To prove Eq. (1.162), we decompose the integral 

$$\int_{-\infty}^{\infty} f(x)\,\delta(g(x))\,dx = \sum_a\int_{a-\varepsilon}^{a+\varepsilon} f(x)\,\delta\big((x-a)\,g'(a)\big)\,dx \tag{1.163}$$

into a sum of integrals over small intervals containing the first-order zeros
of g(x). In these intervals, $g(x) \approx g(a) + (x-a)g'(a) = (x-a)g'(a)$. Using
Eq. (1.161) on the right-hand side of Eq. (1.163), we obtain the integral of
Eq. (1.162).

EXAMPLE 1.14.3


EXAMPLE 1.14.4 


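Evaluate $I \equiv \int_{-\infty}^{\infty} f(x)\,\delta(x^2-2)\,dx$ Because the zeros of the argument of
the delta function, $x^2 - 2$, are $x = \pm\sqrt{2}$, we can write the integral as a sum of
two contributions:

$$I = \int_{\sqrt{2}-\varepsilon}^{\sqrt{2}+\varepsilon}\delta(x-\sqrt{2})\,\frac{f(x)\,dx}{\left|\dfrac{d(x^2-2)}{dx}\right|_{x=\sqrt{2}}} + \int_{-\sqrt{2}-\varepsilon}^{-\sqrt{2}+\varepsilon}\delta(x+\sqrt{2})\,\frac{f(x)\,dx}{\left|\dfrac{d(x^2-2)}{dx}\right|_{x=-\sqrt{2}}}$$

$$= \int_{\sqrt{2}-\varepsilon}^{\sqrt{2}+\varepsilon}\delta(x-\sqrt{2})\,\frac{f(x)\,dx}{2\sqrt{2}} + \int_{-\sqrt{2}-\varepsilon}^{-\sqrt{2}+\varepsilon}\delta(x+\sqrt{2})\,\frac{f(x)\,dx}{2\sqrt{2}} = \frac{f(\sqrt{2}) + f(-\sqrt{2})}{2\sqrt{2}}.$$

This example is good training for the following one. ■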
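
The result of this example can be cross-checked numerically by replacing the delta function with a narrow member of the Gaussian sequence of Eq. (1.154). In the sketch below, the test function f(x) = cos x + x, the value n = 200, the integration window [-3, 3], and the helper names are all illustrative assumptions; the comparison value is the closed form $[f(\sqrt{2}) + f(-\sqrt{2})]/2\sqrt{2}$ obtained above.

```python
import math

def delta_n(u, n=200):
    """Gaussian delta sequence of Eq. (1.154), evaluated at argument u."""
    return n / math.sqrt(math.pi) * math.exp(-(n * u) ** 2)

def integrate(f, a, b, steps=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Illustrative smooth test function (an assumption, not from the text)."""
    return math.cos(x) + x

# Left-hand side: integral of f(x) * delta(x^2 - 2), with delta -> delta_n.
numeric = integrate(lambda x: f(x) * delta_n(x * x - 2.0), -3.0, 3.0)

# Right-hand side: sum over the simple zeros x = +sqrt(2), -sqrt(2), each
# weighted by 1 / |g'(x)| = 1 / (2 sqrt(2)).
root = math.sqrt(2.0)
predicted = (f(root) + f(-root)) / (2.0 * root)
print(numeric, predicted)
```

The two printed numbers should agree to several digits; the remaining difference shrinks as n grows, provided the grid still resolves the peaks near $x = \pm\sqrt{2}$.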


Phase Space In the scattering theory of relativistic particles using Feynman
diagrams, we encounter the following integral over energy of the scattered
particle (we set the velocity of light c = 1):

$$\int d^4p\,\delta(p^2-m^2)f(p) \equiv \int d^3p\int dp_0\,\delta(p_0^2-\mathbf{p}^2-m^2)f(p) = \int_{E>0}\frac{d^3p\,f(E,\mathbf{p})}{2\sqrt{m^2+\mathbf{p}^2}} + \int_{E<0}\frac{d^3p\,f(E,\mathbf{p})}{2\sqrt{m^2+\mathbf{p}^2}},$$

where we have used Eq. (1.162) at the zeros $E = \pm\sqrt{m^2+\mathbf{p}^2}$ of the argument
of the delta function. The physical meaning of $\delta(p^2-m^2)$ is that the particle of
mass m and four-momentum $p^\mu = (p_0, \mathbf{p})$ is on its mass shell because $p^2 = m^2$
is equivalent to $E = \pm\sqrt{m^2+\mathbf{p}^2}$. Thus, the on-mass-shell volume element in
momentum space is the Lorentz invariant $d^3p/2E$, in contrast to the nonrelativistic
$d^3p$ of momentum space. The fact that a negative energy occurs is a peculiarity
of relativistic kinematics that is related to the antiparticle. ■

Using integration by parts we can also define the derivative $\delta'(x)$ of the
Dirac delta function by the relation

$$\int_{-\infty}^{\infty} f(x)\,\delta'(x-x')\,dx = -\int_{-\infty}^{\infty} f'(x)\,\delta(x-x')\,dx = -f'(x'). \tag{1.164}$$
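
A quick numerical check of Eq. (1.164) is possible by differentiating the Gaussian sequence of Eq. (1.154) analytically and integrating it against a smooth test function. The choices below (f = sin, x' = 0.5, n = 50, the window [-1, 2], and the helper names) are assumptions made only for this sketch.

```python
import math

def delta_prime_n(u, n=50):
    """Derivative with respect to u of the Gaussian sequence n/sqrt(pi) * exp(-n^2 u^2)."""
    return -2.0 * n ** 3 * u / math.sqrt(math.pi) * math.exp(-(n * u) ** 2)

def integrate(f, a, b, steps=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

x_prime = 0.5
# Left-hand side of Eq. (1.164) with f = sin and delta' -> delta'_n.
numeric = integrate(lambda x: math.sin(x) * delta_prime_n(x - x_prime), -1.0, 2.0)
# Right-hand side: -f'(x') = -cos(x').
predicted = -math.cos(x_prime)
print(numeric, predicted)
```

The minus sign on the right-hand side, which comes from the integration by parts, shows up directly in the numerical value.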


It should be understood that our Dirac delta function has significance only 
as part of an integrand. Thus, the Dirac delta function is often regarded as a 
linear operator: $\delta(x-x_0)$ operates on $f(x)$ and yields $f(x_0)$:

$$\int_{-\infty}^{\infty}\delta(x-x_0)f(x)\,dx = f(x_0). \tag{1.165}$$

SUMMARY


It may also be classified as a linear mapping or simply as a generalized function. 
Shifting our singularity to the point $x = x'$, we write the Dirac delta function
as $\delta(x-x')$. Equation (1.151) becomes

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-x')\,dx = f(x'). \tag{1.166}$$


As a description of a singularity at $x = x'$, the Dirac delta function may be
written as $\delta(x-x')$ or as $\delta(x'-x)$. Expanding to three dimensions and using
spherical polar coordinates, we obtain

$$f(0) = \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\infty} f(\mathbf{r})\,\delta(\mathbf{r})\,r^2\,dr\,\sin\theta\,d\theta\,d\varphi = \iiint_{-\infty}^{\infty} f(x,y,z)\,\delta(x)\,\delta(y)\,\delta(z)\,dx\,dy\,dz,$$

$$\int_0^{\infty}\frac{\delta(r)}{r^2}\,r^2\,dr\int_{-1}^{1}\delta(\cos\theta)\,d\cos\theta\int_0^{2\pi}\delta(\varphi)\,d\varphi = 1, \tag{1.167}$$

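where each one-dimensional integral is equal to unity. This corresponds to a
singularity (or source) at the origin. Again, if our source is at $\mathbf{r} = \mathbf{r}_1$, Eq. (1.167)
generalizes to

$$\iiint f(\mathbf{r}_2)\,\delta(\mathbf{r}_2-\mathbf{r}_1)\,r_2^2\,dr_2\,\sin\theta_2\,d\theta_2\,d\varphi_2 = f(\mathbf{r}_1), \tag{1.168}$$

where

$$\int_0^{\infty}\frac{\delta(r_2-r_1)}{r_2^2}\,r_2^2\,dr_2\int_{-1}^{1}\delta(\cos\theta_2-\cos\theta_1)\,d\cos\theta_2\int_0^{2\pi}\delta(\varphi_2-\varphi_1)\,d\varphi_2 = 1.$$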


We use $\delta(x)$ frequently and call it the Dirac delta function for historical
reasons.$^{19}$ Remember that it is not really a function. It is essentially a
shorthand notation, defined implicitly as the limit of integrals in a sequence, $\delta_n(x)$,
according to Eq. (1.160).


Biographical Data 

Dirac, Paul Adrien Maurice. Dirac, an English physicist, was born in
Bristol in 1902 and died in Tallahassee, Florida, in 1984. He obtained a degree in electrical
engineering at Bristol and obtained his Ph.D. in mathematical physics in 1926
at Cambridge. By 1932, he was Lucasian professor, like Stokes, the chair
Newton once held. In the 1920s, he advanced quantum mechanics, became
one of the founders of quantum field theory, and, in 1928, discovered his
relativistic equation for the electron that predicted antiparticles, for which
he was awarded the Nobel prize in 1933.


19 Dirac introduced the delta function to quantum mechanics. Actually, the delta function can 
be traced back to Kirchhoff, 1882. For further details, see M. Jammer (1966). The Conceptual 
Development of Quantum Mechanics, p. 301. McGraw-Hill, New York.

EXERCISES
1.14.1 Let 


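$$\delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\[4pt] n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\[4pt] 0, & \dfrac{1}{2n} < x. \end{cases}$$

Show that

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f(x)\,\delta_n(x)\,dx = f(0),$$

assuming that $f(x)$ is continuous at x = 0.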

1.14.2 Verify that the sequence based on the function 

$$\delta_n(x) = \begin{cases} 0, & x < 0, \\ n e^{-nx}, & x > 0, \end{cases}$$

is a delta sequence [satisfying Eq. (1.159)]. Note that the singularity is 
at +0, the positive side of the origin. 

Hint. Replace the upper limit (oo) by c/n, where c is large but finite, 
and use the mean value theorem of integral calculus. 


1.14.3 For

$$\delta_n(x) = \frac{n}{\pi}\,\frac{1}{1+n^2x^2},$$

[Eq. (1.155)], show that

$$\int_{-\infty}^{\infty}\delta_n(x)\,dx = 1.$$


1.14.4 Demonstrate that $\delta_n = \sin nx/\pi x$ is a delta distribution by showing
that

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f(x)\,\frac{\sin nx}{\pi x}\,dx = f(0).$$

Assume that $f(x)$ is continuous at x = 0 and vanishes as $x \to \pm\infty$.
Hint. Replace x by y/n and take $\lim n \to \infty$ before integrating.

1.14.5 Fejér’s method of summing series is associated with the function

$$\delta_n(t) = \frac{1}{2\pi n}\left[\frac{\sin(nt/2)}{\sin(t/2)}\right]^2.$$

Show that $\delta_n(t)$ is a delta distribution in the sense that

$$\lim_{n\to\infty}\frac{1}{2\pi n}\int_{-\pi}^{\pi} f(t)\left[\frac{\sin(nt/2)}{\sin(t/2)}\right]^2 dt = f(0).$$

1.14.6 Using the Gaussian delta sequence ($\delta_n$), Eq. (1.154), show that

$$x\,\frac{d}{dx}\,\delta(x) = -\delta(x),$$

treating $\delta(x)$ and its derivative as in Eq. (1.151).















1.14.7 Show that 



$$\int_{-\infty}^{\infty}\delta'(x)f(x)\,dx = -f'(0).$$


Assume that f'(x) is continuous at x = 0. 


1.14.8 Prove that 


$$\delta(f(x)) = \left|\frac{df(x)}{dx}\right|^{-1}_{x=x_0}\delta(x-x_0),$$

where $x_0$ is chosen so that $f(x_0) = 0$ with $df/dx \neq 0$; that is, $f(x)$
has a simple zero at $x_0$.

Hint. Use $\delta(f)\,df = \delta(x)\,dx$ after explaining why this holds.


1.14.9 Show that in spherical polar coordinates $(r, \cos\theta, \varphi)$ the delta function
$\delta(\mathbf{r}_1-\mathbf{r}_2)$ becomes

$$\frac{1}{r_1^2}\,\delta(r_1-r_2)\,\delta(\cos\theta_1-\cos\theta_2)\,\delta(\varphi_1-\varphi_2).$$

1.14.10 For the finite interval $(-\pi, \pi)$ expand the Dirac delta function $\delta(x-t)$
in a series of sines and cosines: sin nx, cos nx, n = 0, 1, 2, .... Note
that although these functions are orthogonal, they are not normalized
to unity.

1.14.11 In the interval $(-\pi, \pi)$, $\delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2)$.

(a) Expand $\delta_n(x)$ as a Fourier cosine series.

(b) Show that your Fourier series agrees with a Fourier expansion of
$\delta(x)$ in the limit as $n \to \infty$.

(c) Confirm the delta function nature of your Fourier series by showing
that for any $f(x)$ that is finite in the interval $[-\pi, \pi]$ and
continuous at x = 0,

$$\int_{-\pi}^{\pi} f(x)\,[\text{Fourier expansion of } \delta_\infty(x)]\,dx = f(0).$$


1.14.12 (a) Expand $\delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2)$ in the interval $(-\infty, \infty)$ as a
Fourier integral.

(b) Expand $\delta_n(x) = n\exp(-nx)$ as a Laplace transform.


1.14.13 We may define a sequence 


$$\delta_n(x) = \begin{cases} n, & |x| < 1/2n, \\ 0, & |x| > 1/2n. \end{cases}$$

[Eq. (1.153)]. Express $\delta_n(x)$ as a Fourier integral (via the Fourier integral
theorem, inverse transform, etc.). Finally, show that we may
write

$$\delta(x) = \lim_{n\to\infty}\delta_n(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\,dk.$$









1.14.14 Using the sequence

$$\delta_n(x) = \frac{n}{\sqrt{\pi}}\exp(-n^2x^2),$$

show that

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\,dk.$$

Note. Remember that $\delta(x)$ is defined in terms of its behavior as part
of an integrand, especially Eq. (1.159).

1.14.15 Derive sine and cosine representations of $\delta(t-x)$.

Chapter 2


Vector Analysis in 
Curved Coordinates 
and Tensors 


In Chapter 1 we restricted ourselves almost completely to rectangular or
Cartesian coordinate systems. A Cartesian coordinate system offers the unique
advantage that all three unit vectors, $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$, are constant in direction as
well as in magnitude. We did introduce the radial distance r, but even this
was treated as a function of x, y, and z, as $r = \sqrt{x^2+y^2+z^2}$. Unfortunately,
not all physical problems are well adapted to solution in Cartesian
coordinates. For instance, if we have a central force problem, $\mathbf{F} = \hat{\mathbf{r}}F(r)$, such as
gravitational or electrostatic force, Cartesian coordinates may be unusually
inappropriate. Such a problem requires the use of a coordinate system in which
the radial distance is taken to be one of the coordinates, that is, spherical polar
coordinates.

The point is that the coordinate system should be chosen to fit the problem 
to exploit any constraint or symmetry present in it. For example, rectangular 
(spherical) boundary conditions call for Cartesian (polar) coordinates, or a 
rectangular (cylindrical) shape of the system demands Cartesian (cylindrical) 
coordinates. With such a choice, it is hoped that the problem will be more 
readily soluble than if we had forced it into an inappropriate framework. 

Naturally, there is a price that must be paid for the use of a non-Cartesian 
coordinate system. In Chapter 1 we developed the gradient, divergence, and 
curl in Cartesian coordinates, but we have not yet written expressions for 
gradient, divergence, or curl in any of the non-Cartesian coordinate systems. 
Such expressions are developed first for cylindrical coordinates in Section 2.2 
and then developed in a general form in Section 2.3. This system of
curvilinear coordinates is then specialized to spherical polar coordinates in
Section 2.5. There are other useful coordinates, 11 of which can be found
in the second edition of Mathematical Methods and some in Margenau and
Murphy (1956).

As mentioned previously, there are 11 coordinate systems in which the
three-dimensional Helmholtz partial differential equation can be separated into three
ordinary differential equations. Some of these coordinate systems have
achieved prominence in the historical development of quantum mechanics.
Other systems, such as bipolar coordinates, satisfy special needs. Partly
because the needs are rather infrequent, but mostly because the development of
computers and efficient programming techniques reduce the need for these
coordinate systems, the discussion in this chapter is limited to (i) a brief
summary of Cartesian coordinates dealt with extensively in Chapter 1, (ii) circular
cylindrical coordinates, and (iii) spherical polar coordinates. Specifications
and details of the other coordinate systems can be found in the first two
editions of this work and in Morse and Feshbach (1953) and Margenau and Murphy
(1956).


Rectangular Cartesian Coordinates


In Cartesian coordinates we deal with three mutually perpendicular families of
planes: x = constant, y = constant, and z = constant. These are the Cartesian
coordinates on which Chapter 1 is based. In this simplest of all systems, the
coordinate vector and a vector $\mathbf{V}$ are written as

$$\mathbf{r} = \hat{\mathbf{x}}x + \hat{\mathbf{y}}y + \hat{\mathbf{z}}z, \qquad \mathbf{V} = \hat{\mathbf{x}}V_x + \hat{\mathbf{y}}V_y + \hat{\mathbf{z}}V_z. \tag{2.1}$$

The coordinate unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, $\hat{\mathbf{z}}$ are constant in direction and length, and
they are mutually orthogonal, making Cartesian coordinates the simplest for
developing vector analysis.

Now we turn our attention to line, area, and volume elements in order to 
perform multiple integrals and differentiations. From the Pythagorean theorem 
in Cartesian coordinates, the square of the distance between two infinitesimally 
close points is 


$$ds^2 = dx^2 + dy^2 + dz^2, \tag{2.2}$$


where the sum of squares means that these coordinates are called orthogonal. 
That is, no bilinear terms dxdy, dydz, dxdz occur. 

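From Eqs. (1.63), (1.71), and (1.78) we reproduce the main results of
vector analysis in Chapter 1:

$$\nabla\psi = \hat{\mathbf{x}}\frac{\partial\psi}{\partial x} + \hat{\mathbf{y}}\frac{\partial\psi}{\partial y} + \hat{\mathbf{z}}\frac{\partial\psi}{\partial z}, \tag{2.3}$$

$$\nabla\cdot\mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}, \tag{2.4}$$

$$\nabla\cdot\nabla\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}, \tag{2.5}$$

$$\nabla\times\mathbf{V} = \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] V_x & V_y & V_z \end{vmatrix}. \tag{2.6}$$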


Integrals in Cartesian Coordinates 


The simplest integrals are one-dimensional scalars such as the length of a 
space curve 


$$s = \int_{t_1}^{t_2}\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt,$$

if the curve is parameterized as $\mathbf{r}(t) = (x(t), y(t), z(t))$.

The most common line integrals are of the form (see Examples 1.9.1 and
1.9.2)

$$\int\mathbf{A}\cdot d\mathbf{r} = \int(A_x\,dx + A_y\,dy + A_z\,dz) = \int(A_x\dot{x} + A_y\dot{y} + A_z\dot{z})\,dt,$$

thus reducing them to a sum of ordinary integrals in Cartesian coordinates.

From the general formula discussed before Example 1.9.5 and employed 
in it, the area A of a surface z = z(x, y) is given by 


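$$A = \int\frac{dx\,dy}{n_z}, \qquad \frac{1}{n_z} = \left[1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2\right]^{1/2}.$$

Here the integration ranges over the projection of the surface onto the
xy-plane.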

The volume bounded by a surface $z = z(x, y)$ is given by $V = \iint z(x, y)\,dx\,dy$,
which is the three-dimensional generalization of the two-dimensional
area $A = \int f(x)\,dx$ under a curve $y = f(x)$.

For examples of line, surface, and volume integrals in rectangular coordi¬ 
nates, refer to Chapter 1. 


2.2 Circular Cylinder Coordinates 


In the circular cylindrical coordinate system the three curvilinear coordinates
are $(\rho, \varphi, z)$. The limits on $\rho$, $\varphi$, and z are

$$0 \le \rho < \infty, \qquad 0 \le \varphi < 2\pi, \qquad \text{and} \qquad -\infty < z < \infty,$$

and $\varphi$ is not well defined for $\rho = 0$. Note that we are using $\rho$ for the
perpendicular distance from the z-axis and saving r for the distance from the origin.
The z-coordinate remains unchanged. This is essentially a two-dimensional
curvilinear system with a Cartesian z-axis added on to form a three-dimensional
system. The coordinate surfaces, shown in Fig. 2.1, are

Figure 2.1 Circular Cylinder Coordinates


1. Right circular cylinders having the z-axis as a common axis,

$$\rho = (x^2 + y^2)^{1/2} = \text{constant}.$$

2. Half planes through the z-axis,

$$\varphi = \tan^{-1}\left(\frac{y}{x}\right) = \text{constant}.$$

3. Planes parallel to the xy-plane, as in the Cartesian system,

$$z = \text{constant}.$$

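Inverting the preceding equations for $\rho$ and $\varphi$ (or going directly to Fig. 2.2),
we obtain the transformation relations

$$x = \rho\cos\varphi, \qquad y = \rho\sin\varphi, \qquad z = z. \tag{2.7}$$

The coordinate unit vectors are $\hat{\boldsymbol{\rho}}$, $\hat{\boldsymbol{\varphi}}$, $\hat{\mathbf{z}}$ (Fig. 2.2). Note that $\hat{\boldsymbol{\varphi}}$ is not well defined
at $\rho = 0$. The unit vector $\hat{\boldsymbol{\rho}}$ is normal to the cylindrical surface pointing in
the direction of increasing radius $\rho$. The unit vector $\hat{\boldsymbol{\varphi}}$ is tangential to the
cylindrical surface, perpendicular to the half plane $\varphi$ = constant, and pointing
in the direction of increasing azimuth angle $\varphi$. The third unit vector, $\hat{\mathbf{z}}$, is the
usual Cartesian unit vector. They are mutually orthogonal, that is,

$$\hat{\boldsymbol{\rho}}\cdot\hat{\boldsymbol{\varphi}} = \hat{\boldsymbol{\varphi}}\cdot\hat{\mathbf{z}} = \hat{\mathbf{z}}\cdot\hat{\boldsymbol{\rho}} = 0.$$

Figure 2.2 Circular Cylindrical Coordinate Unit Vectors

For example, in circular cylindrical coordinates we have

$$\mathbf{r} = \hat{\boldsymbol{\rho}}\rho + \hat{\mathbf{z}}z, \qquad \mathbf{V} = \hat{\boldsymbol{\rho}}V_\rho + \hat{\boldsymbol{\varphi}}V_\varphi + \hat{\mathbf{z}}V_z. \tag{2.8}$$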
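
The transformation relations and the orthogonality of the unit vectors can be spot-checked with a few lines of code. This is a minimal sketch under stated assumptions: the function names are hypothetical, `math.atan2` is used in place of $\tan^{-1}(y/x)$ so that the correct quadrant is picked and the angle is mapped to $[0, 2\pi)$, and the Cartesian components $\hat{\boldsymbol{\rho}} = (\cos\varphi, \sin\varphi, 0)$ and $\hat{\boldsymbol{\varphi}} = (-\sin\varphi, \cos\varphi, 0)$ are the standard ones.

```python
import math

def to_cylindrical(x, y, z):
    """Cartesian (x, y, z) -> cylindrical (rho, phi, z), with phi in [0, 2*pi)."""
    rho = math.hypot(x, y)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return rho, phi, z

def to_cartesian(rho, phi, z):
    """Cylindrical (rho, phi, z) -> Cartesian (x, y, z), Eq. (2.7)."""
    return rho * math.cos(phi), rho * math.sin(phi), z

def unit_vectors(phi):
    """Cartesian components of rho_hat, phi_hat, z_hat at azimuth phi."""
    rho_hat = (math.cos(phi), math.sin(phi), 0.0)
    phi_hat = (-math.sin(phi), math.cos(phi), 0.0)
    z_hat = (0.0, 0.0, 1.0)
    return rho_hat, phi_hat, z_hat

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

point = (1.0, -2.0, 0.5)
rho, phi, z = to_cylindrical(*point)
rho_hat, phi_hat, z_hat = unit_vectors(phi)

# Position vector rebuilt as r = rho * rho_hat + z * z_hat, as in Eq. (2.8).
rebuilt = tuple(rho * r + z * k for r, k in zip(rho_hat, z_hat))
print(rebuilt, dot(rho_hat, phi_hat), dot(phi_hat, z_hat), dot(z_hat, rho_hat))
```

Note that the position vector has no $\hat{\boldsymbol{\varphi}}$ component: the azimuthal information is carried by the direction of $\hat{\boldsymbol{\rho}}$ itself.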


EXAMPLE 2.2.1 


Distance in Cylindrical Coordinates We now derive the distance $ds^2$ in
circular cylindrical coordinates $\rho, \varphi, z$ with $x = \rho\cos\varphi$, $y = \rho\sin\varphi$ of plane
polar coordinates. Differentiating we obtain

$$dx = \cos\varphi\,d\rho - \rho\sin\varphi\,d\varphi, \qquad dz = dz,$$

using

$$\frac{\partial x}{\partial\rho} = \frac{\partial(\rho\cos\varphi)}{\partial\rho} = \cos\varphi, \qquad \frac{\partial x}{\partial\varphi} = -\rho\sin\varphi.$$

Similarly, we find

$$dy = \sin\varphi\,d\rho + \rho\cos\varphi\,d\varphi$$

using

$$\frac{\partial y}{\partial\rho} = \frac{\partial(\rho\sin\varphi)}{\partial\rho} = \sin\varphi, \qquad \frac{\partial y}{\partial\varphi} = \rho\cos\varphi$$

and, upon squaring and adding,

$$ds^2 = dx^2 + dy^2 + dz^2 = d\rho^2 + \rho^2\,d\varphi^2 + dz^2 \tag{2.9}$$

becomes a sum of squares, again showing that these coordinates are
orthogonal. That is, the terms $d\rho\,d\varphi$, $d\rho\,dz$, $d\varphi\,dz$ drop out, and the straight lines
through the origin with z = 0, $\varphi$ = const., are perpendicular to the circles
z = 0, $\rho$ = const., wherever they intersect; they are orthogonal to the z-axis
as well. The geometry is illustrated in Fig. 2.3. ■

Figure 2.3 (a) Line Elements of Polar Coordinates. (b) Infinitesimal Triangle on a Space Curve


Integrals in Cylindrical Coordinates 


To develop the length of a space curve, we start from the line element $ds^2 =
d\rho^2 + \rho^2\,d\varphi^2 + dz^2$ from Example 2.2.1 and divide it by $dt^2$ to obtain $\dot{s}^2 =
\dot{\rho}^2 + \rho^2\dot{\varphi}^2 + \dot{z}^2$, in case the curve is parameterized as $\mathbf{r}(t) = (\rho(t), \varphi(t), z(t))$.
Using the chain rule, the length of a space curve becomes

$$s = \int_{t_1}^{t_2}\sqrt{\dot{\rho}^2 + \rho^2\dot{\varphi}^2 + \dot{z}^2}\,dt \tag{2.10}$$

and is a scalar integral. For example, for a sector of a circle of radius R in
Fig. 2.4, the line element reduces to $\rho|_{\rho=R}\,d\varphi = R\,d\varphi$ because $d\rho = 0 = dz$.
Therefore, we have $R\int_{\varphi_1}^{\varphi_2} d\varphi = R(\varphi_2 - \varphi_1)$ for the length of arc.
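
To see the cylindrical arc-length formula at work, one can compare it numerically with the Cartesian formula $s = \int\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$ for the same curve. The sketch below is illustrative only: the spiral $\rho = 1 + 0.2t$, $\varphi = t$, $z = 0.5t$ is an arbitrary choice, derivatives are taken by central differences, and all helper names are assumptions.

```python
import math

def integrate(f, a, b, steps=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def derivative(f, t, h=1e-5):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def length_cylindrical(rho, phi, z, t1, t2):
    """Arc length from the cylindrical line element, Eq. (2.10)."""
    def speed(t):
        return math.sqrt(derivative(rho, t) ** 2
                         + (rho(t) * derivative(phi, t)) ** 2
                         + derivative(z, t) ** 2)
    return integrate(speed, t1, t2)

def length_cartesian(x, y, z, t1, t2):
    """Arc length from the Cartesian line element ds^2 = dx^2 + dy^2 + dz^2."""
    def speed(t):
        return math.sqrt(derivative(x, t) ** 2
                         + derivative(y, t) ** 2
                         + derivative(z, t) ** 2)
    return integrate(speed, t1, t2)

# Illustrative conical spiral (an arbitrary example curve).
rho = lambda t: 1.0 + 0.2 * t
phi = lambda t: t
z = lambda t: 0.5 * t
x = lambda t: rho(t) * math.cos(phi(t))
y = lambda t: rho(t) * math.sin(phi(t))

s_cyl = length_cylindrical(rho, phi, z, 0.0, 2.0 * math.pi)
s_cart = length_cartesian(x, y, z, 0.0, 2.0 * math.pi)

# Circular arc of radius R: the formula reduces to R * (phi2 - phi1).
R, phi1, phi2 = 2.0, 0.3, 1.1
s_arc = length_cylindrical(lambda t: R, lambda t: t, lambda t: 0.0, phi1, phi2)
print(s_cyl, s_cart, s_arc)
```

The first two numbers should agree closely, since both formulas describe the same line element, and the third should be close to $R(\varphi_2 - \varphi_1)$.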

The line integral is a sum of ordinary integrals if we expand the dot product 
as 

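$$\int\mathbf{A}\cdot d\mathbf{r} = \int(A_\rho\,d\rho + \rho A_\varphi\,d\varphi + A_z\,dz) = \int(A_\rho\dot{\rho} + \rho A_\varphi\dot{\varphi} + A_z\dot{z})\,dt \tag{2.11}$$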

using the chain rule. 

When A is the vector potential of the magnetic field B = V x A, then 
Stokes’s theorem relates the line integral 


$$\oint\mathbf{A}\cdot d\mathbf{r} = \int_S(\nabla\times\mathbf{A})\cdot d\boldsymbol{\sigma} = \int_S\mathbf{B}\cdot d\boldsymbol{\sigma}$$

to the magnetic flux through the surface S bounded by the closed curve, that
is, a surface integral.


EXAMPLE 2.2.2 


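Magnetic Flux For a constant magnetic field $\mathbf{B} = B\hat{\mathbf{z}}$ in the z-direction, the
flux through a circle of radius R around the origin of the xy-plane is obviously
given by $B\pi R^2$. Let us see what $\oint\mathbf{A}\cdot d\mathbf{r}$ gives. From Example 1.7.1, we take
$\mathbf{A} = \frac{1}{2}(\mathbf{B}\times\mathbf{r})$, with $\mathbf{r} = R\hat{\boldsymbol{\rho}}$ but $d\mathbf{r} = \rho|_{\rho=R}\,d\varphi\,\hat{\boldsymbol{\varphi}} = R\,d\varphi\,\hat{\boldsymbol{\varphi}}$, the same as for the
arc section in Fig. 2.4 because $d\rho = 0 = dz$. Hence,

$$\mathbf{B}\times\mathbf{r}\,\big|_{\rho=R} = \begin{vmatrix} \hat{\boldsymbol{\rho}} & \hat{\boldsymbol{\varphi}} & \hat{\mathbf{z}} \\ 0 & 0 & B \\ \rho & 0 & 0 \end{vmatrix}_{\rho=R} = BR\,\hat{\boldsymbol{\varphi}},$$

so that, with $\rho = R$,

$$\frac{1}{2}\oint_{\rho=R}(\mathbf{B}\times\mathbf{r})\cdot d\mathbf{r} = \frac{1}{2}BR^2\int_0^{2\pi}d\varphi = B\pi R^2. \quad ■$$

EXAMPLE 2.2.3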


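Work Integral As an illustration of a line integral, we want to find the work
done by the angular momentum type force of Example 1.9.2: $\mathbf{F} = -\hat{\mathbf{x}}\sin\varphi +
\hat{\mathbf{y}}\cos\varphi$ around a pielike sector shown in Fig. 2.5 from the origin to x = R, then
at constant radial distance R from azimuthal angle $\varphi = 0$ to $\pi/4$ and finally
back radially to the origin.

The geometry of the path tells us to use plane polar coordinates ($x =
\rho\cos\varphi$, $y = \rho\sin\varphi$). In vector notation, we have

$$\boldsymbol{\rho} = \rho\hat{\boldsymbol{\rho}} = (x, y), \qquad \hat{\boldsymbol{\rho}} = \hat{\mathbf{x}}\cos\varphi + \hat{\mathbf{y}}\sin\varphi.$$

However, the work in these coordinates is $\int\mathbf{F}\cdot d\boldsymbol{\rho}$, where $\mathbf{F}$ still needs to be
expressed as a linear combination of $\hat{\boldsymbol{\rho}}$ and $\hat{\boldsymbol{\varphi}}$ like the path element

$$d\boldsymbol{\rho} = d\rho\,\hat{\boldsymbol{\rho}} + \rho\,d\varphi\,\frac{d\hat{\boldsymbol{\rho}}}{d\varphi}$$

with

$$\frac{d\hat{\boldsymbol{\rho}}}{d\varphi} = -\hat{\mathbf{x}}\sin\varphi + \hat{\mathbf{y}}\cos\varphi,$$

that follows from differentiating $\hat{\boldsymbol{\rho}} = \hat{\mathbf{x}}\cos\varphi + \hat{\mathbf{y}}\sin\varphi$ with respect to $\varphi$. Next,
let us express $\hat{\boldsymbol{\varphi}}$ in Cartesian unit vectors.

Differentiating $1 = \hat{\boldsymbol{\rho}}^2$ we obtain $0 = 2\hat{\boldsymbol{\rho}}\cdot\frac{d\hat{\boldsymbol{\rho}}}{d\varphi}$. The vanishing dot product
means that $\frac{d\hat{\boldsymbol{\rho}}}{d\varphi}$ is perpendicular to $\hat{\boldsymbol{\rho}}$, just like $\hat{\boldsymbol{\varphi}}$. Because $-\hat{\mathbf{x}}\sin\varphi + \hat{\mathbf{y}}\cos\varphi$ is
already a unit vector, it must be equal to $\pm\hat{\boldsymbol{\varphi}}$. Geometry tells us that the + sign
is correct.

Applying this to the force, we notice $\mathbf{F} = \hat{\boldsymbol{\varphi}}$ so that the radial integrals do
not contribute to the work, only the arc does. Therefore, dropping the radial
paths from $d\boldsymbol{\rho}$, we only keep

$$W = \int_0^{\pi/4}\hat{\boldsymbol{\varphi}}\cdot\hat{\boldsymbol{\varphi}}\,\rho\,d\varphi\,\Big|_{\rho=R} = R\int_0^{\pi/4}d\varphi = R\,\frac{\pi}{4}. \quad ■$$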
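
The work integral of this example can be reproduced numerically by summing $\mathbf{F}\cdot\Delta\mathbf{r}$ over short chords of the three path segments. The sketch below assumes R = 2, evaluates the force at the midpoint of each chord, and obtains the azimuth from `math.atan2`; the function names are hypothetical. The two radial legs should contribute essentially nothing, and the arc should give $R\pi/4$.

```python
import math

def force(x, y):
    """F = -x_hat sin(phi) + y_hat cos(phi), with phi the azimuth of (x, y)."""
    phi = math.atan2(y, x)
    return -math.sin(phi), math.cos(phi)

def work_along(path, t1, t2, steps=20_000):
    """Sum F . dr over short chords of the curve path(t), t1 <= t <= t2."""
    h = (t2 - t1) / steps
    total = 0.0
    for i in range(steps):
        xa, ya = path(t1 + i * h)
        xb, yb = path(t1 + (i + 1) * h)
        # Force evaluated at the midpoint of the chord.
        fx, fy = force(0.5 * (xa + xb), 0.5 * (ya + yb))
        total += fx * (xb - xa) + fy * (yb - ya)
    return total

R = 2.0
quarter = math.pi / 4.0

out = work_along(lambda t: (t, 0.0), 0.0, R)                        # origin -> (R, 0)
arc = work_along(lambda t: (R * math.cos(t), R * math.sin(t)), 0.0, quarter)
back = work_along(lambda t: ((R - t) * math.cos(quarter),
                             (R - t) * math.sin(quarter)), 0.0, R)  # back to origin
total = out + arc + back
print(out, arc, back, total)
```

Because the total work around the closed path is not zero, this force is not conservative, which is consistent with its description as an angular momentum type force.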


EXAMPLE 2.2.4 


Magnetic Flux and Stokes’s Theorem When a stationary current I flows
in a long circular coil of wire (Fig. 2.6), Oersted’s law $\oint\mathbf{H}\cdot d\mathbf{r} = I$ and the
geometry tell us that the magnetic induction $\mathbf{B} = B_z\hat{\mathbf{z}}$ is along the z-direction (i.e.,
the coil axis). Moreover, cylindrical coordinates are appropriate to formulate
Stokes’s theorem

$$\Phi \equiv \int\mathbf{B}\cdot d\boldsymbol{\sigma} = \int(\nabla\times\mathbf{A})\cdot d\boldsymbol{\sigma} = \oint\mathbf{A}\cdot d\mathbf{r}.$$

The latter links the magnetic flux through one wire loop of the coil shown
in Fig. 2.7 to the integral of the vector potential along the wire and $\mathbf{A} = A_\varphi\hat{\boldsymbol{\varphi}}$
is nonvanishing only in the azimuthal direction. With

$$d\boldsymbol{\sigma} = \hat{\mathbf{z}}\,\rho\,d\rho\,d\varphi, \qquad d\mathbf{r}\,\big|_{\rho=R} = \rho\,d\varphi\,\hat{\boldsymbol{\varphi}}\,\big|_{\rho=R} = R\,d\varphi\,\hat{\boldsymbol{\varphi}}$$

we obtain the magnetic flux through the cross section of the coil as

$$\Phi = \int_{\rho=0}^{R}\int_{\varphi=0}^{2\pi} B_z\,\rho\,d\rho\,d\varphi = \oint_{\varphi=0}^{2\pi} A_\varphi\,\rho\,d\varphi\,\Big|_{\rho=R} = 2\pi R A_\varphi,$$

so that $A_\varphi = \Phi/2\pi R$. From Oersted’s law, we infer $B_z$ = const. ■

Figure 2.6 Magnetic Flux Through a Long Coil

Figure 2.7 Magnetic Flux Through Wire Loop with Current I


EXAMPLE 2.2.5 


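Area Law for Planetary Motion First, we derive Kepler’s law in cylindrical
coordinates, which states that the radius vector sweeps out equal areas in equal
time from angular momentum conservation.

We consider the sun at the origin as a source of the central gravitational
force $\mathbf{F} = f(r)\hat{\mathbf{r}}$. Then the orbital angular momentum $\mathbf{L} = m\mathbf{r}\times\mathbf{v}$ of a planet
of mass m and velocity $\mathbf{v}$ is conserved because the torque

$$\frac{d\mathbf{L}}{dt} = m\,\frac{d\mathbf{r}}{dt}\times\frac{d\mathbf{r}}{dt} + \mathbf{r}\times\frac{d\mathbf{p}}{dt} = \mathbf{r}\times\mathbf{F} = \frac{f(r)}{r}\,\mathbf{r}\times\mathbf{r} = 0.$$

Hence, $\mathbf{L}$ = const. Now we can choose the z-axis to lie along the direction
of the orbital angular momentum vector, $\mathbf{L} = L\hat{\mathbf{z}}$, and work in cylindrical
coordinates $\mathbf{r} = (\rho, \varphi, z) = \rho\hat{\boldsymbol{\rho}}$ with z = 0 [compare with Eq. (2.8)]. The planet
moves in the xy-plane because $\mathbf{r}$ and $\mathbf{v}$ are perpendicular to $\mathbf{L}$. Thus, we expand
its velocity as follows:

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \dot{\rho}\hat{\boldsymbol{\rho}} + \rho\,\frac{d\hat{\boldsymbol{\rho}}}{dt} = \dot{\rho}\hat{\boldsymbol{\rho}} + \rho\dot{\varphi}\hat{\boldsymbol{\varphi}},$$

and with

$$\hat{\boldsymbol{\rho}} = (\cos\varphi, \sin\varphi), \qquad \frac{\partial\hat{\boldsymbol{\rho}}}{\partial\varphi} = (-\sin\varphi, \cos\varphi) = \hat{\boldsymbol{\varphi}}$$

find that $\frac{d\hat{\boldsymbol{\rho}}}{dt} = \frac{d\hat{\boldsymbol{\rho}}}{d\varphi}\frac{d\varphi}{dt} = \dot{\varphi}\hat{\boldsymbol{\varphi}}$ using the chain rule. As a result,

$$\mathbf{L} = m\boldsymbol{\rho}\times\mathbf{v} = m\rho(\rho\dot{\varphi})(\hat{\boldsymbol{\rho}}\times\hat{\boldsymbol{\varphi}}) = m\rho^2\dot{\varphi}\,\hat{\mathbf{z}} = \text{constant}$$

when we substitute the expansions of $\boldsymbol{\rho}$ and $\mathbf{v}$ in polar coordinates.

The triangular area swept by the radius vector $\boldsymbol{\rho}$ in the time dt (area law)
is then given by

$$A = \frac{1}{2}\int\rho(\rho\,d\varphi) = \frac{1}{2}\int\rho^2\dot{\varphi}\,dt = \frac{L}{2m}\int dt = \frac{L\tau}{2m}, \tag{2.12}$$

if we substitute $m\rho^2\dot{\varphi} = L$ = const. Here, $\tau$ is the period, that is, the time for
one revolution of the planet in its orbit.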

Kepler’s first law states that the orbit is an ellipse. Now we derive the orbit
equation $\rho(\varphi)$ of the ellipse in polar coordinates, where in Fig. 2.8 the sun
is at one focus, which is the origin of our cylindrical coordinates. From the
geometrical construction of the ellipse we know that $\rho' + \rho = 2a$, where a is the
major half-axis; we shall show that this is equivalent to the conventional form
of the ellipse equation. The distance between both foci is $0 < 2ae < 2a$, where
$0 < e < 1$ is the eccentricity of the ellipse. For a circle, e = 0 because both
foci coincide with the center. At the end points of the minor axis, the distances $\rho' = \rho = a$
are equal in Fig. 2.8, and the Pythagorean theorem applied to this rectangular
triangle gives $b^2 + a^2e^2 = a^2$. As a result, $\sqrt{1-e^2} = b/a$ is given by the ratio
of the minor (b) to the major half-axis a.

Figure 2.8 Ellipse in Polar Coordinates

Now we use the cosine theorem for the triangle with the sides
labeled by $\rho'$, $\rho$, $2ae$ in Fig. 2.8 and angle $\pi - \varphi$. Then, squaring the vector
$\boldsymbol{\rho}' = \boldsymbol{\rho} + 2ea\hat{\mathbf{x}}$ gives

$$\rho'^2 = \rho^2 + 4a^2e^2 + 4\rho ae\cos\varphi,$$

and substituting $\rho' = 2a - \rho$, canceling $\rho^2$ on both sides, and dividing by 4a,
yields

$$\rho(1 + e\cos\varphi) = a(1 - e^2) \equiv p, \tag{2.13}$$

the Kepler orbit equation in polar coordinates.

Alternatively, we revert to Cartesian coordinates to find from Eq. (2.13)
with $x = \rho\cos\varphi$ that

$$\rho^2 = x^2 + y^2 = (p - xe)^2 = p^2 + x^2e^2 - 2pxe,$$

so that the familiar ellipse equation in Cartesian coordinates

$$(1 - e^2)\left(x + \frac{pe}{1 - e^2}\right)^2 + y^2 = p^2 + \frac{p^2e^2}{1 - e^2} = \frac{p^2}{1 - e^2}$$

obtains. If we compare this result with the standard form

$$\frac{(x - x_0)^2}{a^2} + \frac{y^2}{b^2} = 1$$

of the ellipse, we confirm that

$$b = \frac{p}{\sqrt{1 - e^2}} = a\sqrt{1 - e^2}, \qquad a = \frac{p}{1 - e^2},$$

and that the distance $x_0$ between the center and focus is $ae$, as shown in Fig. 2.8.

In Example 1.11.1, we derived the formula $2A = \oint_C (x\,dy - y\,dx)$ for the
area $A$ of a simply connected region $R$ enclosed by the simple curve $C$ from
Green’s theorem taking $P = -y$, $Q = x$ in

$$\oint_C (P\,dx + Q\,dy) = \int_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy.$$

Applying this formula to the Kepler ellipse in the form $x = a\cos\varphi$, $y = b\sin\varphi$
yields

$$2A = ab\int_0^{2\pi} (\cos^2\varphi + \sin^2\varphi)\,d\varphi = 2\pi ab$$

for its area $A$. This concludes our Kepler orbit example.
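The line-integral area formula lends itself to a quick numerical check. The sketch below is only an illustration under stated assumptions: the helper name `area_by_line_integral`, the midpoint discretization, and the half-axes $a = 3$, $b = 2$ are choices made here, not taken from the text.

```python
import math

def area_by_line_integral(x, y, n=20000):
    """Approximate A = (1/2) * closed integral of (x dy - y dx) for a closed
    curve (x(t), y(t)), t in [0, 2*pi], using midpoints and chord increments."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t0, t1 = k * dt, (k + 1) * dt
        tm = 0.5 * (t0 + t1)
        dx, dy = x(t1) - x(t0), y(t1) - y(t0)
        total += x(tm) * dy - y(tm) * dx
    return 0.5 * total

a, b = 3.0, 2.0  # illustrative half-axes
area = area_by_line_integral(lambda t: a * math.cos(t), lambda t: b * math.sin(t))
print(area, math.pi * a * b)  # the two numbers should agree closely
```

The same helper works for any closed curve traversed once counterclockwise, which is the orientation assumed by Green’s theorem.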

A closely related case is the area of the rosetta curve $\rho = \cos n\varphi$ shown in
Fig. 2.9 for the cases $n = 1, 2, 3, 4$. As for the area law, we have

$$A = \frac{1}{2}\int_0^{2\pi} \cos^2 n\varphi\,d\varphi = \frac{1}{2n}\int_0^{2n\pi} \cos^2 u\,du = \frac{1}{4n}\int_0^{2n\pi} du = \frac{\pi}{2},$$

independent of $n$.

Another interesting family of curves is given by $\rho = 1 + \epsilon\cos\varphi$. Compare
this with the form of the Kepler orbit equation. For $\epsilon = 0$ we have a unit circle
and for $\epsilon = 1$ a cardioid (plot it). ■
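The $n$-independence of the rosetta integral can be confirmed numerically. This is a small sketch under the assumption that a midpoint rule with a fixed number of steps is accurate enough; the function name `rosetta_area` is a label introduced here.

```python
import math

def rosetta_area(n, steps=20000):
    """Midpoint-rule value of (1/2) * integral_0^{2 pi} cos^2(n phi) d phi."""
    dphi = 2.0 * math.pi / steps
    total = sum(math.cos(n * (k + 0.5) * dphi) ** 2 for k in range(steps))
    return 0.5 * total * dphi

for n in (1, 2, 3, 4):
    print(n, rosetta_area(n), math.pi / 2)
```

Note that the integral runs over the full interval $0 \le \varphi \le 2\pi$; for odd $n$ the curve is traced twice over this interval, so the integral counts the enclosed region twice.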


Gradient 


The starting point for developing the gradient operator in curvilinear coordinates
is the geometric interpretation of the gradient as the vector having the
magnitude and direction of the maximum rate of spatial change of a function $\psi$
(compare Section 1.5). From this interpretation the component of $\nabla\psi(\rho, \varphi, z)$
in the direction normal to the family of surfaces $\rho = \text{constant}$ is given by

$$\hat{\rho}\cdot\nabla\psi = \frac{\partial\psi}{\partial\rho} \tag{2.14}$$

because this is the rate of change of $\psi$ for varying $\rho$, holding $\varphi$ and $z$ fixed. The
$\varphi$-component of the gradient in circular cylindrical coordinates has the form

$$\hat{\varphi}\cdot\nabla\psi = \nabla\psi|_\varphi = \frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} \tag{2.15}$$

because $ds_\varphi = \rho\,d\varphi$ is the angular line element [Eq. (2.9) and Fig. 2.3]. By
repeating this for $z$ and adding vectorially, we see that the gradient becomes

$$\nabla\psi(\rho, \varphi, z) = \hat{\rho}\,\frac{\partial\psi}{\partial\rho} + \hat{\varphi}\,\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} + \hat{z}\,\frac{\partial\psi}{\partial z}. \tag{2.16}$$


EXAMPLE 2.2.6 


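Area Cut from Sphere by Cylinder Consider the unit sphere $\Phi = \rho^2 +
z^2 - 1 = 0$ cut by the cylinder $\rho = \sin\varphi$, which has radius 1/2 and is parallel to
the $z$-axis with center at $y = 1/2$ (Fig. 2.10). Let us calculate the area cut out
by the cylinder. We want to apply the area formula of Example 1.9.5, which in
cylindrical coordinates has the form

$$A = \int \frac{|\nabla\Phi|}{\partial\Phi/\partial z}\,\rho\,d\rho\,d\varphi$$

if the surface is given as $\Phi(\rho, \varphi, z) = 0$. If the surface is defined by $z = f(\rho, \varphi)$,
then $\nabla\Phi = \left(-\frac{\partial f}{\partial\rho}, -\frac{1}{\rho}\frac{\partial f}{\partial\varphi}, 1\right)$ and $|\nabla\Phi|^2 = 1 + \left(\frac{\partial f}{\partial\rho}\right)^2 + \left(\frac{1}{\rho}\frac{\partial f}{\partial\varphi}\right)^2$.

Figure 2.10

Area Cut from Unit Sphere by a Cylinder

In the $xy$-plane the cylinder is given by the circle $x^2 + (y - 1/2)^2 = 1/4$, or
$\rho^2 = y$, that is, $\rho = \sin\varphi$ as stated above. Also, $\nabla\Phi = (2\rho, 0, 2z)$, $\partial\Phi/\partial z = 2z$ so
that $|\nabla\Phi|^2 = 4(\rho^2 + z^2) = 4$ and

$$\frac{|\nabla\Phi|}{\partial\Phi/\partial z} = \frac{1}{z}.$$

Hence, the surface area cut out on the sphere becomes

$$A = 2\int_0^{\pi/2}\!\int_0^{\sin\varphi} \frac{\rho\,d\rho\,d\varphi}{\sqrt{1 - \rho^2}} = -2\int_0^{\pi/2} \sqrt{1 - \rho^2}\,\Big|_{\rho=0}^{\sin\varphi}\,d\varphi = 2\int_0^{\pi/2} (1 - \cos\varphi)\,d\varphi$$

$$= 2(\varphi - \sin\varphi)\Big|_0^{\pi/2} = 2\left(\frac{\pi}{2} - 1\right) = \pi - 2$$

upon integrating over the semicircle in the first quadrant and multiplying by 2
because of symmetry. If we cut out two such cylindrical windows, the remaining
surface of the upper hemisphere ($z > 0$) is $2\pi - 2(\pi - 2) = 4$, independent of $\pi$. ■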
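The value $\pi - 2$ can be reproduced by direct numerical integration. The sketch below makes one assumption that is not in the text: it substitutes $\rho = \sin u$, so that $\rho\,d\rho/\sqrt{1 - \rho^2} = \sin u\,du$ and the integrable singularity at $\rho \to 1$ disappears; the window then corresponds to $0 \le u \le \varphi$. The function name and step count are illustrative.

```python
import math

def window_area(steps=400):
    """Area cut from the upper unit hemisphere by the cylinder rho = sin(phi),
    as 2 * integral_0^{pi/2} [integral_0^{phi} sin(u) du] d phi (midpoint rule)."""
    dphi = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        du = phi / steps
        inner = sum(math.sin((j + 0.5) * du) for j in range(steps)) * du
        total += inner * dphi
    return 2.0 * total  # factor 2 from the symmetry about the y-axis

area = window_area()
print(area, math.pi - 2.0)
print(2.0 * math.pi - 2.0 * area)  # hemisphere surface left after two windows
```

The second printed number should be close to 4, the result quoted at the end of the example.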


Divergence 


The divergence operator may be obtained from the second definition [Eq.
(1.113)] of Chapter 1 or equivalently from Gauss’s theorem (Section 1.11). Let
us use Eq. (1.113),

$$\nabla\cdot\mathbf{V}(\rho, \varphi, z) = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{2.17}$$

with a differential volume $d\tau = \rho\,d\rho\,d\varphi\,dz$ (Fig. 2.11). Note that the positive
directions have been chosen so that $(\rho, \varphi, z)$ or $(\hat{\rho}, \hat{\varphi}, \hat{z})$ form a right-handed
set, $\hat{\rho} \times \hat{\varphi} = \hat{z}$.

Figure 2.11

Volume Element in Cylindrical Coordinates

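The area integral for the two faces $\rho = \text{constant}$ in Fig. 2.11 is given by

$$\left[V_\rho\rho + \frac{\partial}{\partial\rho}(V_\rho\rho)\,d\rho\right] dz\,d\varphi - V_\rho\rho\,d\varphi\,dz = \frac{\partial}{\partial\rho}(V_\rho\rho)\,d\rho\,d\varphi\,dz, \tag{2.18}$$

exactly as in Sections 1.6 and 1.9.¹ Here, $V_\rho$ is the component of $\mathbf{V}$ in the
$\hat{\rho}$-direction, etc., increasing $\rho$; that is, $V_\rho = \hat{\rho}\cdot\mathbf{V}$ is the projection of $\mathbf{V}$ onto the
$\hat{\rho}$-direction. Adding in the similar results for the other two pairs of surfaces,
we obtain (for the differential volume $d\tau = \rho\,d\rho\,d\varphi\,dz$)

$$\int \mathbf{V}(\rho, \varphi, z)\cdot d\boldsymbol{\sigma} = \left[\frac{\partial}{\partial\rho}(V_\rho\rho) + \frac{\partial}{\partial\varphi}V_\varphi + \frac{\partial}{\partial z}(V_z\rho)\right] d\rho\,d\varphi\,dz. \tag{2.19}$$

Division by our differential volume $d\tau$ yields

$$\nabla\cdot\mathbf{V}(\rho, \varphi, z) = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}(V_\rho\rho) + \frac{\partial}{\partial\varphi}V_\varphi + \frac{\partial}{\partial z}(V_z\rho)\right]. \tag{2.20}$$

We may obtain the Laplacian by combining Eqs. (2.16) and (2.20) using
$\mathbf{V} = \nabla\psi(\rho, \varphi, z)$. This leads to

$$\nabla\cdot\nabla\psi(\rho, \varphi, z) = \frac{1}{\rho}\left[\frac{\partial}{\partial\rho}\left(\rho\,\frac{\partial\psi}{\partial\rho}\right) + \frac{\partial}{\partial\varphi}\left(\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi}\right) + \frac{\partial}{\partial z}\left(\rho\,\frac{\partial\psi}{\partial z}\right)\right]. \tag{2.21}$$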


¹Since we take the limit $d\rho, d\varphi, dz \to 0$, the second- and higher order derivatives will drop out.

Finally, to develop $\nabla \times \mathbf{V}$, let us apply Stokes’s theorem (Section 1.11) and,
as with the divergence, take the limit as the surface area becomes vanishingly
small. Working on one component at a time, we consider a differential surface
element in the curvilinear surface $\rho = \text{constant}$. For such a small surface the
mean value theorem of integral calculus tells us that an integral is given by the
surface times the function at a mean value on the small surface. Thus, from

$$\int_s \nabla \times \mathbf{V}|_\rho \cdot d\boldsymbol{\sigma}_\rho = \hat{\rho}\cdot(\nabla \times \mathbf{V})\,\rho\,d\varphi\,dz \tag{2.22}$$

Stokes’s theorem yields

$$\hat{\rho}\cdot(\nabla \times \mathbf{V})\,\rho\,d\varphi\,dz = \oint \mathbf{V}\cdot d\mathbf{r}, \tag{2.23}$$


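with the line integral lying in the surface $\rho = \text{constant}$. Following the loop
(1, 2, 3, 4) of Fig. 2.12,

$$\begin{aligned}
\oint \mathbf{V}(\rho, \varphi, z)\cdot d\mathbf{r} &= V_\varphi\rho\,d\varphi + \left[V_z + \frac{\partial}{\partial\varphi}(V_z)\,d\varphi\right] dz - \left[V_\varphi\rho + \frac{\partial}{\partial z}(V_\varphi\rho)\,dz\right] d\varphi - V_z\,dz \\
&= \left[\frac{\partial}{\partial\varphi}V_z - \frac{\partial}{\partial z}(\rho V_\varphi)\right] d\varphi\,dz.
\end{aligned} \tag{2.24}$$

We pick up a positive sign when going in the positive direction on parts 1
and 2, and a negative sign on parts 3 and 4, because here we are going in the
negative direction. Higher order terms have been omitted. They will vanish in
the limit as the surface becomes vanishingly small ($d\varphi \to 0$, $dz \to 0$).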

Combining Eqs. (2.23) and (2.24) we obtain

$$\nabla \times \mathbf{V}|_\rho = \frac{1}{\rho}\left[\frac{\partial}{\partial\varphi}V_z - \frac{\partial}{\partial z}(\rho V_\varphi)\right]. \tag{2.25}$$

Figure 2.12

Surface Element with $\rho$ = Constant in Cylindrical Coordinates

The remaining two components of $\nabla \times \mathbf{V}$ may be picked up by cyclic permutation
of the indices. As in Chapter 1, it is often convenient to write the curl in
determinant form:

$$\nabla \times \mathbf{V} = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\varphi} & \hat{z} \\[2pt] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\[6pt] V_\rho & \rho V_\varphi & V_z \end{vmatrix}. \tag{2.26}$$

Remember that because of the presence of the differential operators, this
determinant must be expanded from the top down (Eq. 3.11). Note that this
equation is not identical with the form for the cross product of two vectors.
$\nabla$ is not an ordinary vector; it is a vector operator.

Our geometric interpretation of the gradient and the use of Gauss’s and
Stokes’s theorems (or integral definitions of divergence and curl) have enabled
us to obtain these quantities without having to differentiate the
unit vectors $\hat{\rho}$, $\hat{\varphi}$.


EXAMPLE 2.2.7 


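Magnetic Induction of a Long Wire This problem involves the magnetic
induction $\mathbf{B}$ and vector potential $\mathbf{A}$ generated by a long wire in the $z$-direction
carrying a stationary current $I$ shown in Fig. 2.13. Oersted’s law $\oint \mathbf{H}\cdot d\mathbf{r} = I$
and the geometry tell us that $\mathbf{B}$ has only an azimuthal component, and the
vector potential has only a $z$-component from the Biot-Savart law. With $d\mathbf{r} =
\rho\,d\varphi\,\hat{\varphi}$ and $\mathbf{B} = \mu_0\mathbf{H}$ we obtain $\mathbf{B} = \frac{\mu_0 I}{2\pi\rho}\hat{\varphi}$. Using $\mathbf{B} = \nabla \times \mathbf{A}$ we verify that
$\mathbf{A} = -\hat{z}\,\frac{\mu_0 I}{2\pi}\ln\rho$ because Eq. (2.26) gives

$$\nabla \times (\ln\rho\,\hat{z}) = \frac{1}{\rho}\begin{vmatrix} \hat{\rho} & \rho\hat{\varphi} & \hat{z} \\[2pt] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\[6pt] 0 & 0 & \ln\rho \end{vmatrix} = -\hat{\varphi}\,\frac{d\ln\rho}{d\rho} = -\frac{\hat{\varphi}}{\rho}.$$

Figure 2.13

Magnetic Field of a Long Wire with Current $I$

An example of a line integral of the form $\int \mathbf{A} \times d\mathbf{r}$ is a loop of wire $C$ carrying a
current $I$ that is placed in a constant magnetic field $\mathbf{B}$; then the force element
is given by $d\mathbf{F} = I\,d\mathbf{r} \times \mathbf{B}$ so that the force is $\mathbf{F} = I\oint_C d\mathbf{r} \times \mathbf{B}$. ■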
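The relation between the vector potential and the field of the wire can be checked by numerical differentiation. The sketch below is illustrative only: it uses the fact that, for $\mathbf{A} = A_z(\rho)\hat{z}$, the determinant in Eq. (2.26) reduces to $B_\varphi = -\partial A_z/\partial\rho$. The function names, the sample current and radius, and the classical SI value of $\mu_0$ are assumptions introduced here.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (classical value, assumed)

def a_z(rho, current):
    """z-component of the vector potential, A_z = -(mu0 I / 2 pi) ln(rho)."""
    return -MU0 * current / (2.0 * math.pi) * math.log(rho)

def b_phi_from_potential(rho, current, h=1e-6):
    """phi-component of curl A for A = A_z(rho) z_hat: B_phi = -dA_z/drho,
    estimated with a central difference."""
    return -(a_z(rho + h, current) - a_z(rho - h, current)) / (2.0 * h)

def b_phi_direct(rho, current):
    """Field of the long wire from Oersted's law, B_phi = mu0 I / (2 pi rho)."""
    return MU0 * current / (2.0 * math.pi * rho)

rho, current = 0.05, 10.0  # 5 cm from a wire carrying 10 A (illustrative)
print(b_phi_from_potential(rho, current), b_phi_direct(rho, current))
```

Both numbers should agree to many digits, which confirms the sign and prefactor of $\mathbf{A}$ quoted in the example.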


EXERCISES 


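2.2.1 Resolve the circular cylindrical unit vectors into their Cartesian components
(Fig. 2.2).

ANS. $\hat{\rho} = \hat{x}\cos\varphi + \hat{y}\sin\varphi$,
$\hat{\varphi} = -\hat{x}\sin\varphi + \hat{y}\cos\varphi$,
$\hat{z} = \hat{z}$.

2.2.2 Resolve the Cartesian unit vectors into their circular cylindrical components
(Fig. 2.2).

ANS. $\hat{x} = \hat{\rho}\cos\varphi - \hat{\varphi}\sin\varphi$,
$\hat{y} = \hat{\rho}\sin\varphi + \hat{\varphi}\cos\varphi$,
$\hat{z} = \hat{z}$.

2.2.3 From the results of Exercise 2.2.1, show that

$$\frac{\partial\hat{\rho}}{\partial\varphi} = \hat{\varphi}, \qquad \frac{\partial\hat{\varphi}}{\partial\varphi} = -\hat{\rho}$$

and that all other first derivatives of the circular cylindrical unit vectors
with respect to the circular cylindrical coordinates vanish.

2.2.4 Compare $\nabla\cdot\mathbf{V}$ [Eq. (2.20)] with the gradient operator

$$\nabla = \hat{\rho}\,\frac{\partial}{\partial\rho} + \hat{\varphi}\,\frac{1}{\rho}\frac{\partial}{\partial\varphi} + \hat{z}\,\frac{\partial}{\partial z}$$

[Eq. (2.16)] dotted into $\mathbf{V}$. Explain why they differ.

2.2.5 A rigid body is rotating about a fixed axis with a constant angular velocity
$\boldsymbol{\omega}$. Take $\boldsymbol{\omega}$ to lie along the $z$-axis. Express $\mathbf{r}$ in circular cylindrical
coordinates and, using circular cylindrical coordinates, calculate

(a) $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, (b) $\nabla \times \mathbf{v}$.

ANS. (a) $\mathbf{v} = \hat{\varphi}\,\omega\rho$ (b) $\nabla \times \mathbf{v} = 2\boldsymbol{\omega}$.

2.2.6 Halley’s comet has a period of about 76 years, and its closest distance
from the sun is $9 \times 10^7$ km. What is its greatest distance from the sun?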


2.2.7 A planet is in a circular orbit about a star that explodes, shedding 2% 
of its mass in an expanding spherical shell. Find the eccentricity of the 
new orbit of the planet, which otherwise is not affected by the shell. 

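2.2.8 Find the circular cylindrical components of the velocity and acceleration
of a moving particle,

$$\begin{aligned}
v_\rho &= \dot{\rho}, & a_\rho &= \ddot{\rho} - \rho\dot{\varphi}^2, \\
v_\varphi &= \rho\dot{\varphi}, & a_\varphi &= \rho\ddot{\varphi} + 2\dot{\rho}\dot{\varphi}, \\
v_z &= \dot{z}, & a_z &= \ddot{z}.
\end{aligned}$$

Hint.

$$\mathbf{r}(t) = \hat{\rho}(t)\rho(t) + \hat{z}z(t) = [\hat{x}\cos\varphi(t) + \hat{y}\sin\varphi(t)]\rho(t) + \hat{z}z(t).$$

Note, $\dot{\rho} = d\rho/dt$, $\ddot{\rho} = d^2\rho/dt^2$, and so on.

2.2.9 In right circular cylindrical coordinates a particular vector function is
given by

$$\mathbf{V}(\rho, \varphi) = \hat{\rho}V_\rho(\rho, \varphi) + \hat{\varphi}V_\varphi(\rho, \varphi).$$

Show that $\nabla \times \mathbf{V}$ has only a $z$-component.

2.2.10 The linear velocity of particles in a rigid body rotating with angular
velocity $\boldsymbol{\omega}$ is given by

$$\mathbf{v} = \hat{\varphi}\rho\omega.$$

Integrate $\oint \mathbf{v}\cdot d\boldsymbol{\lambda}$ around a circle in the $xy$-plane and verify that

$$\frac{\oint \mathbf{v}\cdot d\boldsymbol{\lambda}}{\text{area}} = \nabla \times \mathbf{v}|_z.$$

2.2.11 Two protons are moving toward each other. Describe their orbits if
they approach (a) head-on and (b) on parallel lines a distance $b$ (impact
parameter) apart.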

Hint. Ignore the strong interaction but keep the Coulomb repulsion. 


2.2.12 A stationary current I flows in the wire loop A anticlockwise. If the 
loop A moves toward a parallel loop B of the same size and shape, in 
which direction will the induced current flow in loop B? 

2.2.13 A particle of mass m and charge e moves in a circular orbit in a mag¬ 
netic field B perpendicular to the orbit. Show that the time the particle 
takes for one orbit does not depend on its velocity. Is this still true if 
you change the direction of the magnetic field? 


2.3 Orthogonal Coordinates 


In Cartesian coordinates we deal with three mutually perpendicular families of
planes: $x = \text{constant}$, $y = \text{constant}$, and $z = \text{constant}$. Imagine that we superimpose
on this system three other families of surfaces $q_i(x, y, z)$, $i = 1, 2, 3$.
The surfaces of any one family need not be parallel to each other and they
need not be planes. If this is difficult to visualize, Fig. 2.14 or the figure
of a specific coordinate system, such as Fig. 2.1, may be helpful. The three new
families of surfaces need not be mutually perpendicular, but for simplicity we
impose such a condition below [Eq. (2.33)]. This orthogonality has many advantages:
Locally perpendicular coordinates are almost like Cartesian coordinates,
where areas and volumes are products of coordinate differentials, and
two-dimensional motion may be separated into analogs of one-dimensional
radial and angular components. A relevant example is the motion of a planet
around a central star in plane polar coordinates (Example 2.2.5). In other
words, we can again break vectors into components efficiently: $(x, y) \to (\rho, \varphi)$,
a powerful concept of physics and engineering. In this section, we develop the
general formalism of orthogonal coordinates, derive from the geometry of
orthogonal coordinates the coordinate differentials, and use them for
line, area, and volume elements in multiple integrals.

Figure 2.14

Curved Coordinates $q_i$ with Varying Directions $\hat{q}_i$

We may describe any point $(x, y, z)$ as the intersection of three planes in
Cartesian coordinates or as the intersection of the three surfaces that form our
new, curvilinear coordinates as sketched in Fig. 2.14. Describing the curvilinear
coordinate surfaces by $q_1 = \text{constant}$, $q_2 = \text{constant}$, $q_3 = \text{constant}$, we may
identify our point by $(q_1, q_2, q_3)$ as well as by $(x, y, z)$. This means that in
principle we may write

$$\begin{array}{ll}
\text{General curvilinear coordinates} & \text{Circular cylindrical coordinates} \\
q_1, q_2, q_3 & \rho, \varphi, z \\[4pt]
x = x(q_1, q_2, q_3) & -\infty < x = \rho\cos\varphi < \infty \\
y = y(q_1, q_2, q_3) & -\infty < y = \rho\sin\varphi < \infty \\
z = z(q_1, q_2, q_3) & -\infty < z = z < \infty
\end{array} \tag{2.27}$$

specifying $x$, $y$, $z$ in terms of the $q$’s and the inverse relations,

$$\begin{array}{ll}
q_1 = q_1(x, y, z) & 0 \le \rho = (x^2 + y^2)^{1/2} < \infty \\
q_2 = q_2(x, y, z) & 0 \le \varphi = \arctan(y/x) < 2\pi \\
q_3 = q_3(x, y, z) & -\infty < z = z < \infty.
\end{array} \tag{2.28}$$

As a specific illustration of the general, abstract $q_1$, $q_2$, $q_3$, the transformation
equations for circular cylindrical coordinates (Section 2.2) are included in
Eqs. (2.27) and (2.28). With each family of surfaces $q_i = \text{constant}$, we can
associate a unit vector $\hat{q}_i$ normal to the surface $q_i = \text{constant}$ and in the
direction of increasing $q_i$. Because the normal to the $q_i = \text{constant}$ surfaces
can point in different directions depending on the position in space (remember
that these surfaces are not planes), the unit vectors $\hat{q}_i$ can depend on the
position in space, just like $\hat{\varphi}$ in cylindrical coordinates. Then the coordinate
vector and a vector $\mathbf{V}$ may be written as

$$\mathbf{r} = \hat{q}_1q_1 + \hat{q}_2q_2 + \hat{q}_3q_3, \qquad \mathbf{V} = \hat{q}_1V_1 + \hat{q}_2V_2 + \hat{q}_3V_3. \tag{2.29}$$

The $\hat{q}_i$ are normalized to $\hat{q}_i^2 = 1$ and form a right-handed coordinate system
with volume $\hat{q}_1\cdot(\hat{q}_2 \times \hat{q}_3) = 1$.

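This example tells us that we need to differentiate $x(q_1, q_2, q_3)$ in Eq. (2.27),
and this leads to (see total differential in Section 1.5)

$$dx = \frac{\partial x}{\partial q_1}dq_1 + \frac{\partial x}{\partial q_2}dq_2 + \frac{\partial x}{\partial q_3}dq_3, \tag{2.30}$$

and similarly for differentiation of $y$ and $z$, that is, $d\mathbf{r} = \sum_i \frac{\partial\mathbf{r}}{\partial q_i}dq_i$.

In curvilinear coordinate space the most general expression for the square
of the distance element can be written as a quadratic form:

$$\begin{aligned}
ds^2 &= g_{11}\,dq_1^2 + g_{12}\,dq_1\,dq_2 + g_{13}\,dq_1\,dq_3 \\
&\quad + g_{21}\,dq_2\,dq_1 + g_{22}\,dq_2^2 + g_{23}\,dq_2\,dq_3 \\
&\quad + g_{31}\,dq_3\,dq_1 + g_{32}\,dq_3\,dq_2 + g_{33}\,dq_3^2 \\
&= \sum_{ij} g_{ij}\,dq_i\,dq_j,
\end{aligned} \tag{2.31}$$

where the mixed terms $dq_i\,dq_j$ with $i \ne j$ signal that these coordinates are not
orthogonal. Spaces for which Eq. (2.31) is the definition of distance are called
metric and Riemannian. Substituting Eq. (2.30) (squared) and the corresponding
results for $dy^2$ and $dz^2$ into Eq. (2.2) and equating coefficients of $dq_i\,dq_j$,²
we find

$$g_{ij} = \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j} = \sum_l \frac{\partial x_l}{\partial q_i}\frac{\partial x_l}{\partial q_j}. \tag{2.32}$$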


These coefficients $g_{ij}$, which we now proceed to investigate, may be viewed as
specifying the nature of the coordinate system $(q_1, q_2, q_3)$. Collectively, these
coefficients are referred to as the metric.³ In general relativity the metric
components are determined by the properties of matter, that is, the $g_{ij}$ are
solutions of Einstein’s nonlinear field equations that are driven by the energy-momentum
tensor of matter: Geometry is merged with physics.

²The $dq$ are arbitrary. For instance, setting $dq_2 = dq_3 = 0$ isolates $g_{11}$. Note that Eq. (2.32) can be
derived from Eq. (2.30) more elegantly with the matrix notation of Chapter 3.

³The tensor nature of the set of $g_{ij}$ follows from the quotient rule (Section 2.8). Then the tensor
transformation law yields Eq. (2.32).

From this point on, we limit ourselves to orthogonal coordinate systems
(defined by mutually perpendicular surfaces or, equivalently, sums of squares
in $ds^2$),⁴ which means (Exercise 2.3.1)

$$g_{ij} = 0, \quad i \ne j, \qquad \text{or} \qquad \hat{q}_i\cdot\hat{q}_j = \delta_{ij}. \tag{2.33}$$

Now, to simplify the notation, we write $g_{ii} = h_i^2$ so that

$$ds^2 = (h_1\,dq_1)^2 + (h_2\,dq_2)^2 + (h_3\,dq_3)^2 = \sum_i (h_i\,dq_i)^2. \tag{2.34}$$

The specific orthogonal coordinate systems in Sections 2.2 and 2.5 are described
by specifying scale factors $h_1$, $h_2$, and $h_3$. Conversely, the scale factors
may be conveniently identified by the relation

$$ds_i = h_i\,dq_i \tag{2.35}$$


for any given $dq_i$, holding the other $q$’s constant. Note that the three curvilinear
coordinates $q_1$, $q_2$, $q_3$ need not be lengths. The scale factors $h_i$ may depend
on the $q$’s and they may have dimensions. The product $h_i\,dq_i$ must have dimensions
of length and be positive. Because Eq. (2.32) can also be written as
a scalar product of the tangent vectors

$$g_{ij} = \frac{\partial\mathbf{r}}{\partial q_i}\cdot\frac{\partial\mathbf{r}}{\partial q_j}, \tag{2.36}$$

the orthogonality condition in Eq. (2.33) in conjunction with the sum of squares
in Eq. (2.34) tells us that for each displacement along a coordinate axis (see
Fig. 2.14)

$$\frac{\partial\mathbf{r}}{\partial q_i} = h_i\hat{q}_i; \tag{2.37}$$

that is, these are the coordinate tangent vectors, so that the differential distance vector
$d\mathbf{r}$ becomes

$$d\mathbf{r} = \sum_i h_i\,dq_i\,\hat{q}_i = \sum_i \frac{\partial\mathbf{r}}{\partial q_i}\,dq_i. \tag{2.38}$$
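Equations (2.36) and (2.37) suggest a direct numerical recipe for the metric: differentiate the transformation, form the scalar products of the tangent vectors, and read off $h_i = \sqrt{g_{ii}}$. The following is a minimal sketch for circular cylindrical coordinates, Eq. (2.27); the function names, the finite-difference step, and the sample point are assumptions made for illustration.

```python
import math

def cylindrical_to_cartesian(q):
    """Eq. (2.27): (rho, phi, z) -> (x, y, z)."""
    rho, phi, z = q
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def tangent_vectors(transform, q, h=1e-6):
    """Central-difference estimates of the tangent vectors dr/dq_i, i = 1, 2, 3."""
    tangents = []
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        rp, rm = transform(qp), transform(qm)
        tangents.append(tuple((a - b) / (2.0 * h) for a, b in zip(rp, rm)))
    return tangents

def metric(transform, q):
    """g_ij = (dr/dq_i) . (dr/dq_j), Eq. (2.36)."""
    t = tangent_vectors(transform, q)
    return [[sum(a * b for a, b in zip(t[i], t[j])) for j in range(3)]
            for i in range(3)]

q = (2.0, 0.7, -1.0)  # sample point (rho, phi, z), chosen arbitrarily
g = metric(cylindrical_to_cartesian, q)
scale_factors = [math.sqrt(g[i][i]) for i in range(3)]
print(scale_factors)        # compare with h_rho = 1, h_phi = rho, h_z = 1
print(g[0][1], g[0][2], g[1][2])  # off-diagonal elements, expected near zero
```

Vanishing off-diagonal elements are the numerical signature of the orthogonality condition, Eq. (2.33).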


Using the curvilinear component form we find that a line integral
becomes

$$\int \mathbf{V}\cdot d\mathbf{r} = \sum_i \int V_ih_i\,dq_i. \tag{2.39}$$

The work $dW = \mathbf{F}\cdot d\mathbf{r}$ done by a force $\mathbf{F}$ along a line element $d\mathbf{r}$ is the most
prominent example in physics for a line integral. In this context, we often use
the chain rule in the form

$$\int_{\mathbf{r}(t_1)}^{\mathbf{r}(t_2)} \mathbf{A}(\mathbf{r}(t))\cdot d\mathbf{r} = \int_{t_1}^{t_2} \mathbf{A}(\mathbf{r}(t))\cdot\frac{d\mathbf{r}}{dt}\,dt. \tag{2.40}$$

⁴In relativistic cosmology the nondiagonal elements of the metric $g_{ij}$ are usually set equal to zero
as a consequence of physical assumptions such as no rotation.






EXAMPLE 2.3.1 


Energy Conservation for Conservative Force Using Eq. (2.40) for a force
$\mathbf{F} = m\frac{d\mathbf{v}}{dt}$ in conjunction with Newton’s equation of motion for a particle of
mass $m$ allows us to integrate analytically the work

$$\int_{\mathbf{r}(t_1)}^{\mathbf{r}(t_2)} \mathbf{F}\cdot d\mathbf{r} = \int_{t_1}^{t_2} \mathbf{F}\cdot\frac{d\mathbf{r}(t)}{dt}\,dt = m\int \frac{d\mathbf{v}}{dt}\cdot\mathbf{v}\,dt = \frac{m}{2}\int \frac{d\mathbf{v}^2}{dt}\,dt = \frac{m}{2}\mathbf{v}^2\Big|_{t_1}^{t_2} = \frac{m}{2}\left[\mathbf{v}^2(t_2) - \mathbf{v}^2(t_1)\right]$$

as the difference of kinetic energies. If the force derives from a potential as
$\mathbf{F} = -\nabla V$, then we can integrate that line integral explicitly because it
contains the gradient

$$\int_{\mathbf{r}(t_1)}^{\mathbf{r}(t_2)} \mathbf{F}\cdot d\mathbf{r} = -\int_{\mathbf{r}(t_1)}^{\mathbf{r}(t_2)} \nabla V(\mathbf{r})\cdot d\mathbf{r} = -V\Big|_{\mathbf{r}(t_1)}^{\mathbf{r}(t_2)} = -\left[V(\mathbf{r}(t_2)) - V(\mathbf{r}(t_1))\right]$$

and identify the work as minus the potential difference. Comparing both expressions,
we have energy conservation

$$\frac{m}{2}\mathbf{v}^2(t_2) + V(\mathbf{r}(t_2)) = \frac{m}{2}\mathbf{v}^2(t_1) + V(\mathbf{r}(t_1))$$

for a conservative force. The path independence of the work is discussed
in detail in Section 1.12. Thus, in this case only the end points of the path $\mathbf{r}(t)$
matter. ■
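The two evaluations of the work can be compared numerically for a concrete conservative force. The sketch below assumes a one-dimensional harmonic oscillator with potential $V = kx^2/2$ and the known trajectory $x(t) = \cos\omega t$; the mass, spring constant, time interval, and function names are illustrative choices, not taken from the text.

```python
import math

m, k = 2.0, 8.0            # illustrative mass and spring constant
omega = math.sqrt(k / m)   # angular frequency of the oscillator

def x(t):
    return math.cos(omega * t)

def v(t):
    return -omega * math.sin(omega * t)

def force(t):
    return -k * x(t)                 # F = -dV/dx for V = k x^2 / 2

def potential(t):
    return 0.5 * k * x(t) ** 2

def kinetic(t):
    return 0.5 * m * v(t) ** 2

def work(t1, t2, steps=20000):
    """Midpoint-rule value of integral F . (dr/dt) dt, following Eq. (2.40)."""
    dt = (t2 - t1) / steps
    return sum(force(t1 + (i + 0.5) * dt) * v(t1 + (i + 0.5) * dt)
               for i in range(steps)) * dt

t1, t2 = 0.1, 0.9
w = work(t1, t2)
print(w, kinetic(t2) - kinetic(t1), -(potential(t2) - potential(t1)))
```

All three printed numbers should coincide, which is the statement of energy conservation for this trajectory.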


In Cartesian coordinates the length of a space curve is given by $\int ds$, with
$ds^2 = dx^2 + dy^2 + dz^2$. If a space curve in curved coordinates is parameterized
as $(q_1(t), q_2(t), q_3(t))$, we find its length by integrating the length element of
Eq. (2.34) so that

$$L = \int_{t_1}^{t_2} \sqrt{h_1^2\left(\frac{dq_1}{dt}\right)^2 + h_2^2\left(\frac{dq_2}{dt}\right)^2 + h_3^2\left(\frac{dq_3}{dt}\right)^2}\;dt \tag{2.41}$$

using the chain rule [Eq. (2.40)]. From Eq. (2.35) we immediately develop the
area and volume elements

$$d\sigma_{ij} = ds_i\,ds_j = h_ih_j\,dq_i\,dq_j \tag{2.42}$$

and

$$d\tau = ds_1\,ds_2\,ds_3 = h_1h_2h_3\,dq_1\,dq_2\,dq_3. \tag{2.43}$$
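As a numerical illustration of the length formula, Eq. (2.41), the following sketch computes the length of one turn of a helix described in circular cylindrical coordinates, where $h_\rho = 1$, $h_\varphi = \rho$, $h_z = 1$. The helix parameters and all function names are assumptions made for this example; the derivatives $dq_i/dt$ are estimated by central differences.

```python
import math

def curve_length(q_of_t, scale_factors, t1, t2, steps=2000, eps=1e-6):
    """Midpoint-rule value of Eq. (2.41):
    L = integral sqrt(sum_i (h_i dq_i/dt)^2) dt."""
    dt = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * dt
        h = scale_factors(q_of_t(t))
        qp, qm = q_of_t(t + eps), q_of_t(t - eps)
        speed_sq = sum((h[i] * (qp[i] - qm[i]) / (2.0 * eps)) ** 2 for i in range(3))
        total += math.sqrt(speed_sq) * dt
    return total

def cylindrical_scale_factors(q):
    rho, phi, z = q
    return (1.0, rho, 1.0)

RADIUS, PITCH = 2.0, 0.5   # illustrative helix: rho = RADIUS, phi = t, z = PITCH * t

def helix(t):
    return (RADIUS, t, PITCH * t)

length = curve_length(helix, cylindrical_scale_factors, 0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * math.sqrt(RADIUS ** 2 + PITCH ** 2))
```

For this helix the integrand is constant, so the result can be compared with the closed form $2\pi\sqrt{R^2 + c^2}$, where $R$ is the radius and $c$ the rise per radian.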

From Eq. (2.42) an area element may be expanded:

$$\begin{aligned}
d\boldsymbol{\sigma} &= ds_2\,ds_3\,\hat{q}_1 + ds_3\,ds_1\,\hat{q}_2 + ds_1\,ds_2\,\hat{q}_3 \\
&= h_2h_3\,dq_2\,dq_3\,\hat{q}_1 + h_3h_1\,dq_3\,dq_1\,\hat{q}_2 + h_1h_2\,dq_1\,dq_2\,\hat{q}_3.
\end{aligned} \tag{2.44}$$

Thus, a surface integral becomes

$$\int \mathbf{V}\cdot d\boldsymbol{\sigma} = \int V_1h_2h_3\,dq_2\,dq_3 + \int V_2h_3h_1\,dq_3\,dq_1 + \int V_3h_1h_2\,dq_1\,dq_2. \tag{2.45}$$

More examples of such line and surface integrals in cylindrical and spherical
polar coordinates appear in Sections 2.2 and 2.5.

In anticipation of the new forms of equations for vector calculus that
appear in the next section, we emphasize that vector algebra is the same in
orthogonal curvilinear coordinates as in Cartesian coordinates. Specifically,
for the dot product

$$\mathbf{A}\cdot\mathbf{B} = \sum_{i,k} A_i\,\hat{q}_i\cdot\hat{q}_k\,B_k = \sum_{i,k} A_iB_k\delta_{ik} = \sum_i A_iB_i, \tag{2.46}$$

where the subscripts indicate curvilinear components. For the cross product

$$\mathbf{A} \times \mathbf{B} = \sum_{i,k} A_i\,\hat{q}_i \times \hat{q}_k\,B_k = \begin{vmatrix} \hat{q}_1 & \hat{q}_2 & \hat{q}_3 \\ A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \end{vmatrix}, \tag{2.47}$$

as in Eq. (1.40).


EXAMPLE 2.3.2 


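Orbital Angular Momentum in Cylindrical Coordinates In circular
cylindrical coordinates the orbital angular momentum takes the form [see
Eq. (2.8) for the coordinate vector $\mathbf{r} = \rho\hat{\rho} + z\hat{z}$ and Example 2.2.5 for the
velocity $\mathbf{v} = \dot{\rho}\hat{\rho} + \rho\dot{\varphi}\hat{\varphi} + \dot{z}\hat{z}$]

$$\mathbf{L} = \mathbf{r} \times \mathbf{p} = m\begin{vmatrix} \hat{\rho} & \hat{\varphi} & \hat{z} \\ \rho & 0 & z \\ \dot{\rho} & \rho\dot{\varphi} & \dot{z} \end{vmatrix}. \tag{2.48}$$

Now let us take the mass to be 3 kg, the lever arm as 1 m in the radial direction
of the $xy$-plane, and the velocity as 2 m/s in the $z$-direction. Then we expect $\mathbf{L}$
to be in the $\hat{\varphi}$ direction and quantitatively

$$\mathbf{L} = 3\begin{vmatrix} \hat{\rho} & \hat{\varphi} & \hat{z} \\ 1 & 0 & 0 \\ 0 & 0 & 2 \end{vmatrix} = 3\hat{\rho}\begin{vmatrix} 0 & 0 \\ 0 & 2 \end{vmatrix} - 3\hat{\varphi}\begin{vmatrix} 1 & 0 \\ 0 & 2 \end{vmatrix} + 3\hat{z}\begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix} = -6\hat{\varphi}\ \text{kg m}^2/\text{s}. \tag{2.49}$$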
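Because vector algebra in an orthonormal curvilinear basis has the same component form as in Cartesian coordinates [Eq. (2.47)], the numbers of Eq. (2.49) can be reproduced with an ordinary component-wise cross product. The helper `cross` below is a hypothetical name introduced for this sketch; components are ordered along $(\hat{\rho}, \hat{\varphi}, \hat{z})$.

```python
def cross(a, b):
    """Cross product of component triples in a right-handed orthonormal basis."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

mass = 3.0                 # kg
r = (1.0, 0.0, 0.0)        # lever arm: 1 m along rho_hat
v = (0.0, 0.0, 2.0)        # velocity: 2 m/s along z_hat

angular_momentum = tuple(mass * c for c in cross(r, v))
print(angular_momentum)    # components along (rho_hat, phi_hat, z_hat), in kg m^2/s
```

Only the $\hat{\varphi}$ component is nonzero, with the value $-6$, in agreement with Eq. (2.49).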


Previously, we specialized to locally rectangular coordinates that are adapted
to special symmetries. Let us now briefly examine the more general case in
which the coordinates are not necessarily orthogonal. Surface and volume
elements are part of multiple integrals, which are common in physical applications
such as center of mass determinations and moments of inertia. Typically,
we choose coordinates according to the symmetry of the particular problem.
In Chapter 1 we used Gauss’s theorem to transform a volume integral into a
surface integral and Stokes’s theorem to transform a surface integral into a
line integral. For orthogonal coordinates, the surface and volume elements
are simply products of the line elements $h_i\,dq_i$ [see Eqs. (2.42) and (2.43)].
For the general case, we use the geometric meaning of $\partial\mathbf{r}/\partial q_i$ in Eq. (2.37)
as tangent vectors. We start with the Cartesian surface element $dx\,dy$, which
becomes an infinitesimal parallelogram in the new coordinates $q_1$, $q_2$ formed by the
two incremental vectors

$$\begin{aligned}
d\mathbf{r}_1 &= \mathbf{r}(q_1 + dq_1, q_2) - \mathbf{r}(q_1, q_2) = \frac{\partial\mathbf{r}}{\partial q_1}\,dq_1, \\
d\mathbf{r}_2 &= \mathbf{r}(q_1, q_2 + dq_2) - \mathbf{r}(q_1, q_2) = \frac{\partial\mathbf{r}}{\partial q_2}\,dq_2,
\end{aligned} \tag{2.50}$$

whose area is the $z$-component of their cross product, or

$$dx\,dy = d\mathbf{r}_1 \times d\mathbf{r}_2\big|_z = \left[\frac{\partial x}{\partial q_1}\frac{\partial y}{\partial q_2} - \frac{\partial x}{\partial q_2}\frac{\partial y}{\partial q_1}\right] dq_1\,dq_2 = \begin{vmatrix} \dfrac{\partial x}{\partial q_1} & \dfrac{\partial x}{\partial q_2} \\[8pt] \dfrac{\partial y}{\partial q_1} & \dfrac{\partial y}{\partial q_2} \end{vmatrix} dq_1\,dq_2. \tag{2.51}$$


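The transformation coefficient in determinant form is called the Jacobian.

Similarly, the volume element $dx\,dy\,dz$ becomes the triple scalar product
of the three infinitesimal displacement vectors $d\mathbf{r}_i = dq_i\,\frac{\partial\mathbf{r}}{\partial q_i}$ along the $q_i$ directions
$\hat{q}_i$, which according to Section 1.4 takes on the form

$$dx\,dy\,dz = \begin{vmatrix} \dfrac{\partial x}{\partial q_1} & \dfrac{\partial x}{\partial q_2} & \dfrac{\partial x}{\partial q_3} \\[8pt] \dfrac{\partial y}{\partial q_1} & \dfrac{\partial y}{\partial q_2} & \dfrac{\partial y}{\partial q_3} \\[8pt] \dfrac{\partial z}{\partial q_1} & \dfrac{\partial z}{\partial q_2} & \dfrac{\partial z}{\partial q_3} \end{vmatrix} dq_1\,dq_2\,dq_3. \tag{2.52}$$

Here, the coefficient is also called the Jacobian, and so on in higher dimensions.

For orthogonal coordinates the Jacobians simplify to products of the
orthogonal vectors in Eq. (2.38). It follows that they are products of the $h_i$; for
example, the volume Jacobian in Eq. (2.52) becomes

$$h_1h_2h_3\,(\hat{q}_1 \times \hat{q}_2)\cdot\hat{q}_3 = h_1h_2h_3.$$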
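The reduction of the Jacobian to the product $h_1h_2h_3$ can be tested numerically. The sketch below uses spherical polar coordinates, for which $h_r = 1$, $h_\theta = r$, $h_\varphi = r\sin\theta$; it estimates the partial derivatives in Eq. (2.52) by central differences and evaluates the determinant as a triple scalar product. The function names and the sample point are assumptions introduced for illustration.

```python
import math

def spherical_to_cartesian(q):
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian_determinant(transform, q, h=1e-6):
    """Determinant of d(x, y, z)/d(q1, q2, q3), Eq. (2.52), with the partial
    derivatives estimated by central differences."""
    columns = []
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        xp, xm = transform(qp), transform(qm)
        columns.append([(a - b) / (2.0 * h) for a, b in zip(xp, xm)])
    c1, c2, c3 = columns  # c_i approximates the tangent vector dr/dq_i
    # The determinant equals the triple scalar product c1 . (c2 x c3).
    c2_cross_c3 = (c2[1] * c3[2] - c2[2] * c3[1],
                   c2[2] * c3[0] - c2[0] * c3[2],
                   c2[0] * c3[1] - c2[1] * c3[0])
    return sum(a * b for a, b in zip(c1, c2_cross_c3))

r, theta, phi = 2.0, 0.7, 1.2   # arbitrary sample point
numeric = jacobian_determinant(spherical_to_cartesian, (r, theta, phi))
expected = r ** 2 * math.sin(theta)   # h_r * h_theta * h_phi
print(numeric, expected)
```

The same routine applies to nonorthogonal coordinates, where no product of scale factors is available and the determinant has to be used directly.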


EXAMPLE 2.3.3 


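Jacobians for Polar Coordinates Let us illustrate the transformation of
the Cartesian two-dimensional volume element $dx\,dy$ to polar coordinates
$\rho$, $\varphi$, with $x = \rho\cos\varphi$, $y = \rho\sin\varphi$. (See also Section 2.2.) Here,

$$dx\,dy = \begin{vmatrix} \dfrac{\partial x}{\partial\rho} & \dfrac{\partial x}{\partial\varphi} \\[8pt] \dfrac{\partial y}{\partial\rho} & \dfrac{\partial y}{\partial\varphi} \end{vmatrix} d\rho\,d\varphi = \begin{vmatrix} \cos\varphi & -\rho\sin\varphi \\ \sin\varphi & \rho\cos\varphi \end{vmatrix} d\rho\,d\varphi = \rho\,d\rho\,d\varphi. \tag{2.53}$$

Similarly, in spherical coordinates (see Section 2.5), we get from $x = r\sin\theta\cos\varphi$,
$y = r\sin\theta\sin\varphi$, $z = r\cos\theta$ the Jacobian

$$\begin{aligned}
J &= \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial\varphi} \\[8pt] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial\varphi} \\[8pt] \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial\theta} & \dfrac{\partial z}{\partial\varphi} \end{vmatrix} = \begin{vmatrix} \sin\theta\cos\varphi & r\cos\theta\cos\varphi & -r\sin\theta\sin\varphi \\ \sin\theta\sin\varphi & r\cos\theta\sin\varphi & r\sin\theta\cos\varphi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} \\[6pt]
&= \cos\theta\begin{vmatrix} r\cos\theta\cos\varphi & -r\sin\theta\sin\varphi \\ r\cos\theta\sin\varphi & r\sin\theta\cos\varphi \end{vmatrix} + r\sin\theta\begin{vmatrix} \sin\theta\cos\varphi & -r\sin\theta\sin\varphi \\ \sin\theta\sin\varphi & r\sin\theta\cos\varphi \end{vmatrix} \\[6pt]
&= r^2(\cos^2\theta\sin\theta + \sin^3\theta) = r^2\sin\theta
\end{aligned} \tag{2.54}$$

by expanding the determinant along the third line. Hence, the volume element
becomes $dx\,dy\,dz = r^2\,dr\,\sin\theta\,d\theta\,d\varphi$. The volume integral can be written as

$$\int f(x, y, z)\,dx\,dy\,dz = \int f\big(x(r, \theta, \varphi), y(r, \theta, \varphi), z(r, \theta, \varphi)\big)\,r^2\,dr\,\sin\theta\,d\theta\,d\varphi. \ ■$$

SUMMARY

We have developed the general formalism for vector analysis in curvilinear
coordinates. For most applications, locally orthogonal coordinates can be
chosen, for which surface and volume elements in multiple integrals are products
of line elements. For the general nonorthogonal case, Jacobian determinants
apply.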


Biographical Data 

Jacobi, Carl Gustav Jacob. Jacobi, a German mathematician, was born
in Potsdam, Prussia, in 1804 and died in Berlin in 1851. He obtained his
Ph.D. in Berlin in 1824. Praised by Legendre for his work on elliptic functions,
he became a professor at the University of Königsberg in 1827 (East
Prussia, which is now Russia). He also developed determinants and partial
differential equations, among other contributions.


EXERCISES 

2.3.1 Show that limiting our attention to orthogonal coordinate systems implies
that $g_{ij} = 0$ for $i \ne j$ [Eq. (2.33)].

Hint. Construct a triangle with sides $ds_1$, $ds_2$, and $ds_3$. Equation (2.42)
must hold regardless of whether $g_{ij} = 0$. Then compare $ds^2$ from Eq.
(2.34) with a calculation using the law of cosines. Show that $\cos\theta_{12} = g_{12}/\sqrt{g_{11}g_{22}}$.

2.3.2 In the spherical polar coordinate system $q_1 = r$, $q_2 = \theta$, $q_3 = \varphi$. The
transformation equations corresponding to Eq. (2.27) are

$$x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta.$$

(a) Calculate the spherical polar coordinate scale factors: $h_r$, $h_\theta$, and $h_\varphi$.

(b) Check your calculated scale factors by the relation $ds_i = h_i\,dq_i$.

2.3.3 The $u$-, $v$-, $z$-coordinate system frequently used in electrostatics and in
hydrodynamics is defined by

$$xy = u, \qquad x^2 - y^2 = v, \qquad z = z.$$

This $u$-, $v$-, $z$-system is orthogonal.

(a) In words, describe briefly the nature of each of the three families of
coordinate surfaces.

(b) Sketch the system in the $xy$-plane showing the intersections of surfaces
of constant $u$ and surfaces of constant $v$ with the $xy$-plane
(using graphical software if available).

(c) Indicate the directions of the unit vectors $\hat{u}$ and $\hat{v}$ in all four quadrants.

(d) Is this $u$-, $v$-, $z$-system right-handed ($\hat{u} \times \hat{v} = +\hat{z}$) or left-handed
($\hat{u} \times \hat{v} = -\hat{z}$)?

2.3.4 The elliptic cylindrical coordinate system consists of three families of
surfaces:

$$\frac{x^2}{a^2\cosh^2 u} + \frac{y^2}{a^2\sinh^2 u} = 1, \qquad \frac{x^2}{a^2\cos^2 v} - \frac{y^2}{a^2\sin^2 v} = 1, \qquad z = z.$$

Sketch the coordinate surfaces $u = \text{constant}$ and $v = \text{constant}$ as they
intersect the first quadrant of the $xy$-plane (using graphical software if
available). Show the unit vectors $\hat{u}$ and $\hat{v}$. The range of $u$ is $0 \le u < \infty$,
and the range of $v$ is $0 \le v \le 2\pi$.

Hint. It is easier to work with the square of each side of this equation.

2.3.5 Determine the volume of an n-dimensional sphere of radius r. 

Hint. Use generalized polar coordinates. 


2.3.6 Minkowski space is defined as $x_1 = x$, $x_2 = y$, $x_3 = z$, and $x_0 = ct$. This
is done so that the space-time interval is $ds^2 = dx_0^2 - dx_1^2 - dx_2^2 - dx_3^2$
($c$ = velocity of light). Show that the metric in Minkowski space is

$$(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

We use Minkowski space in Section 4.4 for describing Lorentz transformations.


2.4 Differential Vector Operators 


Gradient 


The starting point for developing the gradient, divergence, and curl operators
in curvilinear coordinates is the geometric interpretation of the gradient as
the vector having the magnitude and direction of the maximum space rate of
change of a function $\psi$ (compare Section 1.5). From this interpretation the
component of $\nabla\psi(q_1, q_2, q_3)$ in the direction normal to the family of surfaces
$q_1 = \text{constant}$ is given by⁵

$$\hat{q}_1\cdot\nabla\psi = \nabla\psi|_1 = \frac{\partial\psi}{\partial s_1} = \frac{1}{h_1}\frac{\partial\psi}{\partial q_1} \tag{2.55}$$

since this is the rate of change of $\psi$ for varying $q_1$, holding $q_2$ and $q_3$ fixed. For
example, from Example 2.2.1 [and $h_\rho = 1$, $h_\varphi = \rho$, $h_z = 1$ from Eq. (2.9)] the
$\varphi$-component of the gradient in circular cylindrical coordinates has the form
given by Eq. (2.15). The quantity $ds_1$ is a differential length in the direction of
increasing $q_1$ [compare Eq. (2.35)]. In Section 2.3, we introduced a unit vector
$\hat{q}_1$ to indicate this direction. By repeating Eq. (2.55) for $q_2$ and again for $q_3$ and
adding vectorially, the gradient becomes

$$\begin{aligned}
\nabla\psi(q_1, q_2, q_3) &= \hat{q}_1\frac{\partial\psi}{\partial s_1} + \hat{q}_2\frac{\partial\psi}{\partial s_2} + \hat{q}_3\frac{\partial\psi}{\partial s_3} \\
&= \hat{q}_1\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \hat{q}_2\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \hat{q}_3\frac{1}{h_3}\frac{\partial\psi}{\partial q_3} \\
&= \sum_i \hat{q}_i\frac{1}{h_i}\frac{\partial\psi}{\partial q_i}.
\end{aligned} \tag{2.56}$$

Exercise 2.2.4 offers a mathematical alternative independent of this physical
interpretation of the gradient. Examples are given for cylindrical coordinates
in Section 2.2 and spherical polar coordinates in Section 2.5.


Divergence 


The divergence operator may be obtained from the second definition [Eq.
(1.113)] of Chapter 1 or equivalently from Gauss’s theorem (Section 1.11). Let
us use Eq. (1.113),

$$\nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{2.57}$$

with a differential volume $d\tau = h_1h_2h_3\,dq_1\,dq_2\,dq_3$ (Fig. 2.15). Note that the
positive directions have been chosen so that $(q_1, q_2, q_3)$ or $(\hat{q}_1, \hat{q}_2, \hat{q}_3)$ form a
right-handed set, $\hat{q}_1 \times \hat{q}_2 = \hat{q}_3$.

The area integral for the two faces $q_1 = \text{constant}$ in Fig. 2.15 is given by

$$\left[V_1h_2h_3 + \frac{\partial}{\partial q_1}(V_1h_2h_3)\,dq_1\right] dq_2\,dq_3 - V_1h_2h_3\,dq_2\,dq_3 = \frac{\partial}{\partial q_1}(V_1h_2h_3)\,dq_1\,dq_2\,dq_3, \tag{2.58}$$

⁵Here, the use of $\varphi$ to label a function is avoided because it is conventional to use this symbol to
denote an azimuthal coordinate.

Figure 2.15

Curvilinear Volume Element

as in Sections 1.6 and 1.9.⁶ Here, $V_i$ is the component of $\mathbf{V}$ in the $\hat{q}_i$-direction,
increasing $q_i$; that is, $V_i = \hat{q}_i\cdot\mathbf{V}$ is the projection of $\mathbf{V}$ onto the $\hat{q}_i$-direction.
Adding in the similar results for the other two pairs of surfaces, we obtain

$$\int \mathbf{V}(q_1, q_2, q_3)\cdot d\boldsymbol{\sigma} = \left[\frac{\partial}{\partial q_1}(V_1h_2h_3) + \frac{\partial}{\partial q_2}(V_2h_3h_1) + \frac{\partial}{\partial q_3}(V_3h_1h_2)\right] dq_1\,dq_2\,dq_3. \tag{2.59}$$

Division by our differential volume [see $d\tau$ after Eq. (2.57)] yields

$$\nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}(V_1h_2h_3) + \frac{\partial}{\partial q_2}(V_2h_3h_1) + \frac{\partial}{\partial q_3}(V_3h_1h_2)\right]. \tag{2.60}$$

Applications and examples of this general result will be given in the following
section for a special coordinate system. We may obtain the Laplacian by
combining Eqs. (2.56) and (2.60), using $\mathbf{V} = \nabla\psi(q_1, q_2, q_3)$. This leads to

$$\nabla\cdot\nabla\psi(q_1, q_2, q_3) = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right]. \tag{2.61}$$

Examples and numerical applications of the central Eqs. (2.56), (2.60), (2.61),
and (2.66) are shown for cylindrical coordinates in Section 2.2 and for spherical
polar coordinates in Section 2.5.
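Equation (2.61) translates almost literally into a numerical routine once the scale factors are supplied as a function of the coordinates. The following is a sketch under stated assumptions: nested central differences with a fixed step, spherical polar scale factors $h_r = 1$, $h_\theta = r$, $h_\varphi = r\sin\theta$, and two test functions chosen here because their Laplacians are known ($\nabla^2 r^2 = 6$ and $\nabla^2 z = 0$ with $z = r\cos\theta$). All names are illustrative.

```python
import math

def laplacian(psi, scale_factors, q, h=1e-3):
    """Eq. (2.61): (1/(h1 h2 h3)) * sum_i d/dq_i [(h1 h2 h3 / h_i^2) d psi/d q_i],
    evaluated with nested central differences of step h."""
    def weighted_derivative(i, point):
        hs = scale_factors(point)
        qp, qm = list(point), list(point)
        qp[i] += h
        qm[i] -= h
        dpsi = (psi(qp) - psi(qm)) / (2.0 * h)
        return hs[0] * hs[1] * hs[2] / hs[i] ** 2 * dpsi

    total = 0.0
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        total += (weighted_derivative(i, qp) - weighted_derivative(i, qm)) / (2.0 * h)
    hs = scale_factors(q)
    return total / (hs[0] * hs[1] * hs[2])

def spherical_scale_factors(q):
    r, theta, phi = q
    return (1.0, r, r * math.sin(theta))

point = (2.0, 0.7, 1.2)   # arbitrary sample point (r, theta, phi)
lap_r2 = laplacian(lambda q: q[0] ** 2, spherical_scale_factors, point)
lap_z = laplacian(lambda q: q[0] * math.cos(q[1]), spherical_scale_factors, point)
print(lap_r2, lap_z)      # expected to be close to 6 and 0
```

Swapping in a different `scale_factors` function, for example `(1.0, rho, 1.0)` for cylindrical coordinates, reuses the same routine for Eq. (2.21).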


6 Since we take the limit dqi, dq 2 , dq 3 —> 0, the second- and higher order derivatives will drop out. 
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Finally, to develop V x V, let us apply Stokes’s theorem (Section 1.11) and, 
as with divergence, take the limit as the surface area becomes vanishingly 
small. Working on one component at a time, we consider a differential surface 
element in the curvilinear surface qi = constant. For such a small surface the 
mean value theorem of integral calculus states that an integral is given by the 
surface times the function at a mean value on the small surface. Thus, from 

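\[
\int_s \nabla \times \mathbf{V}|_1 \cdot d\boldsymbol{\sigma}_1
  = \hat{\mathbf{q}}_1 \cdot (\nabla \times \mathbf{V})\, h_2 h_3\, dq_2\, dq_3,
\tag{2.62}
\]

Stokes's theorem yields

\[
\hat{\mathbf{q}}_1 \cdot (\nabla \times \mathbf{V})\, h_2 h_3\, dq_2\, dq_3 = \oint \mathbf{V} \cdot d\mathbf{r},
\tag{2.63}
\]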


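with the line integral lying in the surface $q_1$ = constant. Following the loop (1,
2, 3, 4) of Fig. 2.16,

\[
\begin{aligned}
\oint \mathbf{V}(q_1, q_2, q_3) \cdot d\mathbf{r}
  &= V_2 h_2\, dq_2
   + \left[ V_3 h_3 + \frac{\partial}{\partial q_2}(V_3 h_3)\, dq_2 \right] dq_3 \\
  &\quad - \left[ V_2 h_2 + \frac{\partial}{\partial q_3}(V_2 h_2)\, dq_3 \right] dq_2
   - V_3 h_3\, dq_3 \\
  &= \left[ \frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2) \right] dq_2\, dq_3.
\end{aligned}
\tag{2.64}
\]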


We pick up a positive sign when going in the positive direction on parts 1 
and 2 and a negative sign on parts 3 and 4 because here we are going in the 
negative direction. Higher order terms have been omitted. They will vanish in 
the limit as the surface becomes vanishingly small ($dq_2 \to 0$, $dq_3 \to 0$).


Figure 2.16 Curvilinear Surface Element with $q_1$ = Constant
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Combining Eqs. (2.63) and (2.64), we obtain 


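\[
\nabla \times \mathbf{V}|_1
  = \frac{1}{h_2 h_3} \left[ \frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2) \right].
\tag{2.65}
\]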


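The remaining two components of $\nabla \times \mathbf{V}$ may be picked up by cyclic permutation
of the indices. As in Chapter 1, it is often convenient to write the curl in
determinant form:

\[
\nabla \times \mathbf{V} = \frac{1}{h_1 h_2 h_3}
\begin{vmatrix}
\hat{\mathbf{q}}_1 h_1 & \hat{\mathbf{q}}_2 h_2 & \hat{\mathbf{q}}_3 h_3 \\[4pt]
\dfrac{\partial}{\partial q_1} & \dfrac{\partial}{\partial q_2} & \dfrac{\partial}{\partial q_3} \\[8pt]
h_1 V_1 & h_2 V_2 & h_3 V_3
\end{vmatrix}.
\tag{2.66}
\]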


Remember that because of the presence of the differential operators, this
determinant must be expanded from the top down. Note that this equation is
not identical to the form for the cross product of two vectors [Eq. (1.40)]. $\nabla$
is not an ordinary vector; it is a vector operator.

Our geometric interpretation of the gradient and the use of Gauss's and
Stokes's theorems (or integral definitions of divergence and curl) have enabled
us to obtain these general formulas without having to differentiate the unit
vectors $\hat{\mathbf{q}}_i$. There exist alternate ways to determine grad, div, and curl based
on direct differentiation of the $\hat{\mathbf{q}}_i$. One approach resolves the $\hat{\mathbf{q}}_i$ of a specific
coordinate system into its Cartesian components (Exercises 2.2.1 and 2.5.1)
and differentiates this Cartesian form (Exercises 2.4.3 and 2.5.2). The point
is that the derivatives of the Cartesian $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ vanish since $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$
are constant in direction as well as in magnitude. A second approach [L. J.
Kijewski, Am. J. Phys. 33, 816 (1965)] starts from the equality of $\partial^2 \mathbf{r}/\partial q_i\,\partial q_j$
and $\partial^2 \mathbf{r}/\partial q_j\,\partial q_i$ and develops the derivatives of $\hat{\mathbf{q}}_i$ in a general curvilinear form.
Exercises 2.3.3 and 2.3.4 are based on this method.
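
The curl formula can be exercised numerically in the same spirit. The sketch below applies Eq. (2.65) and its cyclic permutations with central differences; the helper names, the step size, and the test field are illustrative assumptions. The test field is the rigid rotation $\mathbf{v} = \hat{\boldsymbol{\varphi}}\,\omega r \sin\theta$ of Exercise 2.5.3, whose curl is $2\boldsymbol{\omega}$, that is, $(2\omega\cos\theta,\ -2\omega\sin\theta,\ 0)$ in spherical polar components.

```python
import math

def curl(V, scale, q, eps=1e-5):
    """Curl components along (q1, q2, q3) from Eq. (2.65) and cyclic permutations.

    V(q1, q2, q3) returns the physical components (V1, V2, V3);
    scale(q1, q2, q3) returns (h1, h2, h3). Names are illustrative.
    """
    def hV(point, i):
        # the product h_i * V_i that appears inside the derivatives
        return scale(*point)[i] * V(*point)[i]

    def d(i, j):
        # central difference of h_i V_i with respect to q_j
        up, down = list(q), list(q)
        up[j] += eps
        down[j] -= eps
        return (hV(up, i) - hV(down, i)) / (2 * eps)

    h = scale(*q)
    result = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3  # i, j, k in cyclic order
        result.append((d(k, j) - d(j, k)) / (h[j] * h[k]))
    return result

def spherical_scale(r, theta, phi):
    return (1.0, r, r * math.sin(theta))

omega = 1.5  # arbitrary angular velocity for the test

def rigid_rotation(r, theta, phi):
    # v = phi-hat * omega * r * sin(theta)
    return (0.0, 0.0, omega * r * math.sin(theta))

point = (1.3, 0.7, 0.4)
components = curl(rigid_rotation, spherical_scale, point)
print(components)
```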


EXERCISES 


2.4.1 Develop arguments to show that ordinary dot and cross products (not
involving $\nabla$) in orthogonal curvilinear coordinates proceed as in Cartesian
coordinates with no involvement of scale factors.


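2.4.2 With $\hat{\mathbf{q}}_1$ a unit vector in the direction of increasing $q_1$, show that

(a) $\displaystyle \nabla \cdot \hat{\mathbf{q}}_1 = \frac{1}{h_1 h_2 h_3} \frac{\partial (h_2 h_3)}{\partial q_1}$,

(b) $\displaystyle \nabla \times \hat{\mathbf{q}}_1 = \frac{1}{h_1} \left[ \hat{\mathbf{q}}_2 \frac{1}{h_3} \frac{\partial h_1}{\partial q_3} - \hat{\mathbf{q}}_3 \frac{1}{h_2} \frac{\partial h_1}{\partial q_2} \right]$.

Note that even though $\hat{\mathbf{q}}_1$ is a unit vector, its divergence and curl do not
necessarily vanish (because it varies with position).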
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2 . 4.3 Start from Eq. (2.37) and show (i) that 4; • <1, = 1 leads to an expression 
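for $h_i$ in agreement with Eq. (2.32) and (ii) derive

\[
\frac{\partial \hat{\mathbf{q}}_i}{\partial q_j} = \hat{\mathbf{q}}_j \frac{1}{h_i} \frac{\partial h_j}{\partial q_i}, \qquad i \neq j,
\]

and

\[
\frac{\partial \hat{\mathbf{q}}_i}{\partial q_i} = - \sum_{j \neq i} \hat{\mathbf{q}}_j \frac{1}{h_j} \frac{\partial h_i}{\partial q_j}.
\]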


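2.4.4 Derive

\[
\nabla\psi = \hat{\mathbf{q}}_1 \frac{1}{h_1} \frac{\partial\psi}{\partial q_1}
           + \hat{\mathbf{q}}_2 \frac{1}{h_2} \frac{\partial\psi}{\partial q_2}
           + \hat{\mathbf{q}}_3 \frac{1}{h_3} \frac{\partial\psi}{\partial q_3}
\]

by direct application of Eq. (1.112),

\[
\nabla\psi = \lim_{\int d\tau \to 0} \frac{\int \psi\, d\boldsymbol{\sigma}}{\int d\tau}.
\]

Hint. Evaluation of the surface integral will lead to terms such as
$(h_1 h_2 h_3)^{-1} (\partial/\partial q_1)(\hat{\mathbf{q}}_1 h_2 h_3)$. The results listed in Exercise 2.4.3 will be
helpful. Cancellation of unwanted terms occurs when the contributions
of all three pairs of surfaces are added together.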


2.5 Spherical Polar Coordinates 


Relabeling $(q_1, q_2, q_3)$ as $(r, \theta, \varphi)$, we see that the spherical polar coordinate
system consists of the following:

1. Concentric spheres centered at the origin,

   $r = (x^2 + y^2 + z^2)^{1/2} = \text{const.}$

2. Right circular cones centered on the $z$-(polar) axis, vertices at the origin,

   $\displaystyle \theta = \arccos \frac{z}{(x^2 + y^2 + z^2)^{1/2}} = \text{const.}$

3. Half planes through the $z$-(polar) axis,

   $\displaystyle \varphi = \arctan \frac{y}{x} = \text{const.}$

By our arbitrary choice of definitions of $\theta$ (the polar angle) and $\varphi$ (the azimuth
angle), the $z$-axis is singled out for special treatment. The transformation equations
corresponding to Eq. (2.27) are

\[
x = r \sin\theta \cos\varphi, \qquad y = r \sin\theta \sin\varphi, \qquad z = r \cos\theta,
\tag{2.67}
\]

measuring $\theta$ from the positive $z$-axis and $\varphi$ in the $xy$-plane from the positive
$x$-axis. The ranges of values are $0 \le r < \infty$, $0 \le \theta \le \pi$, and $0 \le \varphi \le 2\pi$. At
$r = 0$, $\theta$ and $\varphi$ are undefined. The coordinate vector measured from the origin
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is r = rr. From Eqs. (2.32) and (2.35), 

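\[
h_1 = h_r = 1, \qquad h_2 = h_\theta = r, \qquad h_3 = h_\varphi = r \sin\theta.
\]

This gives a (vectorial) line element

\[
d\mathbf{r} = \hat{\mathbf{r}}\, dr + \hat{\boldsymbol{\theta}}\, r\, d\theta + \hat{\boldsymbol{\varphi}}\, r \sin\theta\, d\varphi
\]

so that

\[
ds^2 = d\mathbf{r} \cdot d\mathbf{r} = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\varphi^2
\tag{2.68}
\]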


because the unit vectors are mutually orthogonal. 
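
The scale factors quoted above can be recovered directly from the transformation equations (2.67), since $h_i = |\partial \mathbf{r}/\partial q_i|$. The short sketch below does this by central differences; the function names, the step size, and the sample point are assumptions chosen only for illustration.

```python
import math

def cartesian(r, theta, phi):
    """Eq. (2.67): Cartesian coordinates of the point (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def scale_factors(q, eps=1e-6):
    """Estimate h_i = |d r / d q_i| at q = (r, theta, phi) by central differences."""
    h = []
    for i in range(3):
        up, down = list(q), list(q)
        up[i] += eps
        down[i] -= eps
        deriv = [(a - b) / (2 * eps) for a, b in zip(cartesian(*up), cartesian(*down))]
        h.append(math.sqrt(sum(c * c for c in deriv)))
    return h

h_r, h_theta, h_phi = scale_factors((2.0, 0.9, 2.1))
print(h_r, h_theta, h_phi)  # compare with 1, r, r*sin(theta)
```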

In this spherical coordinate system, the area element [for $r$ = constant;
compare Eqs. (2.42) and (2.44)] is

\[
dA = d\sigma_{\theta\varphi} = r^2 \sin\theta\, d\theta\, d\varphi,
\tag{2.69}
\]

the light, unshaded area in Fig. 2.17. Integrating over the azimuth $\varphi$, the area
element becomes a ring of width $d\theta$,

\[
dA = 2\pi r^2 \sin\theta\, d\theta.
\tag{2.70}
\]

This form will appear repeatedly in problems in spherical polar coordinates
with azimuthal symmetry, such as the scattering of an unpolarized beam of
particles. By definition of solid radians or steradians, an element of solid angle
$d\Omega$ is given by

\[
d\Omega = \frac{dA}{r^2} = \sin\theta\, d\theta\, d\varphi = |d(\cos\theta)\, d\varphi|.
\tag{2.71}
\]


Figure 2.17 Spherical Polar Coordinate Area Elements
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Figure 2.18 

Spherical Polar 
Coordinates 



Integrating over the entire spherical surface, we obtain

\[
\int d\Omega = 4\pi.
\]

From Eq. (2.43) the volume element is

\[
d\tau = r^2\, dr\, \sin\theta\, d\theta\, d\varphi = r^2\, dr\, d\Omega.
\tag{2.72}
\]

The spherical polar coordinate unit vectors are shown in Fig. 2.18. 

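It must be emphasized that the unit vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\varphi}}$ vary in direction
as the angles $\theta$ and $\varphi$ vary. Specifically, the $\theta$ and $\varphi$ derivatives of these
spherical polar coordinate unit vectors do not vanish (Exercise 2.5.2). When
differentiating vectors in spherical polar (or in any non-Cartesian system) this
variation of the unit vectors with position must not be neglected. In terms of
the fixed-direction Cartesian unit vectors $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$,

\[
\begin{aligned}
\hat{\mathbf{r}} &= \hat{\mathbf{x}} \sin\theta \cos\varphi + \hat{\mathbf{y}} \sin\theta \sin\varphi + \hat{\mathbf{z}} \cos\theta, \\
\hat{\boldsymbol{\theta}} &= \hat{\mathbf{x}} \cos\theta \cos\varphi + \hat{\mathbf{y}} \cos\theta \sin\varphi - \hat{\mathbf{z}} \sin\theta, \\
\hat{\boldsymbol{\varphi}} &= -\hat{\mathbf{x}} \sin\varphi + \hat{\mathbf{y}} \cos\varphi.
\end{aligned}
\tag{2.73}
\]

Note that Exercise 2.5.4 gives the inverse transformation.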
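
A small numerical check of Eq. (2.73) may help fix the geometry: at any point the three unit vectors should be mutually orthogonal, of unit length, and right-handed ($\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\varphi}}$). The helper names and the sample angles below are illustrative choices, not notation from the text.

```python
import math

def unit_vectors(theta, phi):
    """Cartesian components of r-hat, theta-hat, phi-hat from Eq. (2.73)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r_hat = (st * cp, st * sp, ct)
    theta_hat = (ct * cp, ct * sp, -st)
    phi_hat = (-sp, cp, 0.0)
    return r_hat, theta_hat, phi_hat

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

r_hat, theta_hat, phi_hat = unit_vectors(0.8, 2.5)
print(dot(r_hat, theta_hat), dot(theta_hat, phi_hat), dot(phi_hat, r_hat))
print(cross(r_hat, theta_hat), phi_hat)  # the two triples should agree
```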

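A given vector can now be expressed in a number of different (but equivalent)
ways. For instance, the position vector $\mathbf{r}$ may be written

\[
\begin{aligned}
\mathbf{r} = r\hat{\mathbf{r}} &= r\left( \hat{\mathbf{x}} \frac{x}{r} + \hat{\mathbf{y}} \frac{y}{r} + \hat{\mathbf{z}} \frac{z}{r} \right) \\
  &= \hat{\mathbf{x}} x + \hat{\mathbf{y}} y + \hat{\mathbf{z}} z \\
  &= \hat{\mathbf{x}}\, r \sin\theta \cos\varphi + \hat{\mathbf{y}}\, r \sin\theta \sin\varphi + \hat{\mathbf{z}}\, r \cos\theta.
\end{aligned}
\tag{2.74}
\]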
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Select the form that is most useful for your particular problem, that is, r for 
spherical polar coordinates, the second form for Cartesian coordinates, and 
the third when converting from Cartesian to spherical polar coordinates. 

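From Section 2.4, relabeling the curvilinear coordinate unit vectors $\hat{\mathbf{q}}_1$, $\hat{\mathbf{q}}_2$,
and $\hat{\mathbf{q}}_3$ as $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\varphi}}$ gives

\[
\nabla\psi = \hat{\mathbf{r}} \frac{\partial\psi}{\partial r}
           + \hat{\boldsymbol{\theta}} \frac{1}{r} \frac{\partial\psi}{\partial\theta}
           + \hat{\boldsymbol{\varphi}} \frac{1}{r \sin\theta} \frac{\partial\psi}{\partial\varphi},
\tag{2.75}
\]

\[
\nabla \cdot \mathbf{V} = \frac{1}{r^2 \sin\theta} \left[
    \sin\theta \frac{\partial}{\partial r}(r^2 V_r)
  + r \frac{\partial}{\partial\theta}(\sin\theta\, V_\theta)
  + r \frac{\partial V_\varphi}{\partial\varphi} \right],
\tag{2.76}
\]

\[
\nabla \cdot \nabla\psi = \frac{1}{r^2 \sin\theta} \left[
    \sin\theta \frac{\partial}{\partial r}\!\left( r^2 \frac{\partial\psi}{\partial r} \right)
  + \frac{\partial}{\partial\theta}\!\left( \sin\theta \frac{\partial\psi}{\partial\theta} \right)
  + \frac{1}{\sin\theta} \frac{\partial^2\psi}{\partial\varphi^2} \right],
\tag{2.77}
\]

\[
\nabla \times \mathbf{V} = \frac{1}{r^2 \sin\theta}
\begin{vmatrix}
\hat{\mathbf{r}} & r\hat{\boldsymbol{\theta}} & r \sin\theta\, \hat{\boldsymbol{\varphi}} \\[4pt]
\dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\[8pt]
V_r & r V_\theta & r \sin\theta\, V_\varphi
\end{vmatrix}.
\tag{2.78}
\]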


EXAMPLE 2.5.1 


$\nabla$, $\nabla\cdot$, $\nabla\times$ for Central Force  Using Eqs. (2.75)-(2.78), we can reproduce
by inspection some of the results derived in Chapter 1 by laborious application
of Cartesian coordinates.

From Eq. (2.75)

\[
\nabla f(r) = \hat{\mathbf{r}} \frac{df}{dr}, \qquad \nabla r^n = \hat{\mathbf{r}}\, n r^{n-1}.
\tag{2.79}
\]

For example, for the Coulomb potential $V = Ze^2/r$, the electric field is
$\mathbf{E} = -\nabla V/e = -Ze\,\nabla(1/r) = (Ze/r^2)\,\hat{\mathbf{r}}$.

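From Eq. (2.76)

\[
\nabla \cdot \hat{\mathbf{r}} f(r) = \frac{2}{r} f(r) + \frac{df}{dr}, \qquad
\nabla \cdot \hat{\mathbf{r}}\, r^n = (n + 2)\, r^{n-1}.
\tag{2.80}
\]

For example, for $r > 0$ the charge density of the electric field of the Coulomb
potential is $\rho = \nabla \cdot \mathbf{E} = Ze\, \nabla \cdot (\hat{\mathbf{r}}/r^2) = 0$.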

From Eq. (2.77)

\[
\nabla^2 f(r) = \frac{2}{r} \frac{df}{dr} + \frac{d^2 f}{dr^2},
\tag{2.81}
\]

\[
\nabla^2 r^n = n(n + 1)\, r^{n-2},
\tag{2.82}
\]
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in contrast to the ordinary radial second derivative of r" involving n— 1 instead 
of n+ 1. 
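
For readers who want the intermediate step, Eq. (2.82) follows from Eq. (2.81) with $f = r^n$ in one line; the display below is a short worked check rather than part of the original example.

```latex
\[
\nabla^2 r^n
  = \frac{2}{r}\,\frac{d}{dr} r^n + \frac{d^2}{dr^2} r^n
  = 2n\, r^{n-2} + n(n-1)\, r^{n-2}
  = n(n+1)\, r^{n-2}.
\]
```

The extra $2n\,r^{n-2}$ from the first-derivative term is what turns the factor $n - 1$ of the ordinary second derivative into $n + 1$. For $n = -1$ the result vanishes for $r > 0$, consistent with the vanishing charge density found from Eq. (2.80).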

Finally, from Eq. (2.78) 


\[
\nabla \times \hat{\mathbf{r}} f(r) = 0.
\tag{2.83}
\]


Integrals in Spherical Polar Coordinates 

The length of a space curve

\[
s = \int_{t_1}^{t_2} \sqrt{\dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta\, \dot{\varphi}^2}\; dt
\]

is a scalar integral, in case the curve is parameterized as $\mathbf{r}(t) = (r(t), \theta(t), \varphi(t))$.

The line integral is a sum of ordinary integrals if we expand the dot product
as

\[
\int \mathbf{A} \cdot d\mathbf{r}
  = \int (A_r\, dr + r A_\theta\, d\theta + r \sin\theta\, A_\varphi\, d\varphi)
  = \int (A_r \dot{r} + r A_\theta \dot{\theta} + r \sin\theta\, A_\varphi \dot{\varphi})\, dt.
\]

See Example 1.9.2 for a line integral in plane polar coordinates, or for an ellipse
$x = a \cos\omega t$, $y = b \sin\omega t$ see Example 2.2.5 and the rosetta curves.


EXAMPLE 2.5.2 


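Surface of Hemisphere  We can write the area formula as

\[
\int\!\!\int h_\theta h_\varphi\, d\theta\, d\varphi
  = \int_0^{2\pi}\!\!\int_0^{\pi/2} r^2 \sin\theta\, d\theta\, d\varphi
  = 2\pi r^2 \int_0^{\pi/2} \sin\theta\, d\theta
  = -2\pi r^2 \cos\theta \Big|_0^{\pi/2} = 2\pi r^2,
\]

as expected. ■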


EXAMPLE 2.5.3 


Area Cut from Sphere by Cylinder  We take up the unit sphere cut by a
cylinder from Example 2.2.6 and Fig. 2.10 again, but now we use the spherical
area element $\sin\theta\, d\theta\, d\varphi$. The cylinder is described (in the $xy$-plane) by
$\rho = \cos\varphi$, which is the $n = 1$ case of the rosetta curves in Fig. 2.9 of Example 2.2.5,
and $z = \cos\theta$ holds on the sphere. Inserting $x^2 + y^2 = \cos^2\varphi$ and $z = \cos\theta$ into
$x^2 + y^2 + z^2 = 1$, we get $\cos\varphi = \sin\theta$, implying $\varphi = \pm(\pi/2 - \theta)$ as integration
limits for $\varphi$ at fixed $\theta$. Hence, the area cut from the sphere is given by

\[
\begin{aligned}
A &= \int_{\theta=0}^{\pi/2} \int_{\varphi=\theta-\pi/2}^{\pi/2-\theta} \sin\theta\, d\theta\, d\varphi
   = 2 \int_0^{\pi/2} \left( \frac{\pi}{2} - \theta \right) \sin\theta\, d\theta \\
  &= \pi - 2 \int_0^{\pi/2} \theta \sin\theta\, d\theta
   = \pi - 2\,[-\theta \cos\theta + \sin\theta]\Big|_0^{\pi/2} = \pi - 2,
\end{aligned}
\]

upon integrating the second term by parts. This agrees with our previous result,
although the intermediate steps are quite different. ■
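
The value $\pi - 2$ is easy to confirm numerically. The sketch below is a plain midpoint-rule evaluation of the double integral over $0 \le \theta \le \pi/2$, $|\varphi| \le \pi/2 - \theta$; the function name and the grid size are arbitrary choices made for this illustration.

```python
import math

def area_cut_by_cylinder(n=400):
    """Midpoint-rule estimate of the integral of sin(theta) d(theta) d(phi)
    over 0 <= theta <= pi/2 and |phi| <= pi/2 - theta."""
    d_theta = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        phi_max = math.pi / 2 - theta
        d_phi = 2 * phi_max / n
        for _ in range(n):
            # the integrand is independent of phi; the inner loop simply
            # mirrors the structure of the double integral
            total += math.sin(theta) * d_theta * d_phi
    return total

area = area_cut_by_cylinder()
print(area, math.pi - 2)
```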
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EXAMPLE 2.5.4 


EXAMPLE 2.5.5 


The volume bounded by a surface $z = z(x, y)$ is given by $V = \int\!\!\int z(x, y)\, dx\, dy$,
the three-dimensional generalization of the two-dimensional area $A = \int f(x)\, dx$
under a curve $y = f(x)$, which we apply in the next example.


Volume of Ellipsoid  The ellipsoid equation with half axes $a$, $b$, $c$ is given by

\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad \text{so that} \qquad
z(x, y) = c \sqrt{1 - \frac{x^2}{a^2} - \frac{y^2}{b^2}}.
\]

The complicated square-root form of $z$ demonstrates that Cartesian coordinates
are inappropriate to calculate the volume of an ellipsoid. It is easier to
parameterize the ellipsoid in spherical polar coordinates as

\[
x = a \sin\theta \cos\varphi, \qquad y = b \sin\theta \sin\varphi, \qquad z = c \cos\theta.
\]

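We check that the ellipsoid equation is satisfied by squaring the coordinates and
adding them up. Next, we transform the volume integral to polar coordinates
using the Jacobian

\[
dx\, dy =
\begin{vmatrix}
\dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial\varphi} \\[8pt]
\dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial\varphi}
\end{vmatrix} d\theta\, d\varphi
=
\begin{vmatrix}
a \cos\theta \cos\varphi & -a \sin\theta \sin\varphi \\
b \cos\theta \sin\varphi & b \sin\theta \cos\varphi
\end{vmatrix} d\theta\, d\varphi
= ab\, d\theta\, d\varphi\, \sin\theta \cos\theta,
\]

with the volume in the positive octant given by

\[
\begin{aligned}
\int\!\!\int z(x, y)\, dx\, dy
  &= \int_{\varphi=0}^{\pi/2} \int_{\theta=0}^{\pi/2} c \cos\theta\; ab \sin\theta \cos\theta\, d\theta\, d\varphi \\
  &= \frac{\pi}{2}\, abc \int_0^1 \cos^2\theta\, d\cos\theta
   = \frac{\pi}{6}\, abc\, \cos^3\theta \Big|_{\cos\theta=0}^{1} = \frac{\pi}{6}\, abc.
\end{aligned}
\]

Multiplying by 8, we find the ellipsoid volume $\frac{4}{3}\pi abc$, which reduces to that
of a sphere for $a = b = c = R$. ■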


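Surface of Rotation Ellipsoid  The ellipsoid surface is a much more complicated
function of its half axes. For simplicity, we specialize to a rotation
ellipsoid with $a = b$, for which we use the general area formula employed in
Example 1.9.5:

\[
A = \int \frac{dx\, dy}{n_z}.
\tag{2.84}
\]

The partial derivatives

\[
\frac{\partial z(x, y)}{\partial x} = -\frac{c^2 x}{a^2 z} = -\frac{c \sin\theta \cos\varphi}{a \cos\theta}, \qquad
\frac{\partial z(x, y)}{\partial y} = -\frac{c^2 y}{b^2 z} = -\frac{c \sin\theta \sin\varphi}{b \cos\theta}
\]

enter the area formula via

\[
\frac{1}{n_z} = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}
             = \sqrt{1 + \frac{c^2}{a^2} \tan^2\theta},
\]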
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where the ^-dependence drops out for a = b (only) so that the ellipsoidal area 
is given by 


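\[
\begin{aligned}
A &= a^2 \int_{\varphi=0}^{2\pi} \int_{\theta=0}^{\pi} \sin\theta \cos\theta \sqrt{1 + \frac{c^2}{a^2} \tan^2\theta}\; d\theta\, d\varphi \\
  &= \int_{\theta=0}^{\pi} 2\pi a \sin\theta \sqrt{a^2 \cos^2\theta + c^2 \sin^2\theta}\; d\theta.
\end{aligned}
\tag{2.85}
\]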


The last expression has a direct geometric interpretation shown in Fig. 2.19 as
the surface area of a rotation ellipsoid $A = \int 2\pi y\, ds$, with radius $y = a \sin\theta$
and arc length

\[
ds = \sqrt{(dx)^2 + (dz)^2} = \sqrt{a^2 \cos^2\theta + c^2 \sin^2\theta}\; d\theta
\]

with $z = c \cos\theta$, a result that we could have used from the start, of course.
Instead, we have shown as a by-product that the general area formula Eq. (2.84)
is consistent with that generated by rotating the ellipse $x = a \sin\theta$, $z = c \cos\theta$
about the $z$-axis. The area integral [Eq. (2.85)] can be evaluated for $c > a$ by
substituting $t = \epsilon \cos\theta$, $dt = -\epsilon \sin\theta\, d\theta$ in conjunction with the eccentricity
$\epsilon = \sqrt{(c^2 - a^2)/c^2}$ to yield


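\[
\begin{aligned}
A &= 4\pi ac \int_0^1 \sqrt{1 - \epsilon^2 \cos^2\theta}\; d\cos\theta
   = 4\pi \frac{ac}{\epsilon} \int_0^{\epsilon} \sqrt{1 - t^2}\; dt \\
  &= 2\pi \frac{ac}{\epsilon} \left[ t \sqrt{1 - t^2} + \arcsin t \right]_0^{\epsilon}
   = 2\pi ac \left[ \sqrt{1 - \epsilon^2} + \frac{1}{\epsilon} \arcsin \epsilon \right].
\end{aligned}
\]

This result goes over into the area $4\pi R^2$ of a sphere in the limit $a = c = R$ and
$\epsilon \to 0$, where the term in brackets goes to 2 using l'Hôpital's rule for $\arcsin \epsilon/\epsilon$.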

The substitution used for the area integral in the case $c > a$ amounted to
solving

\[
A = 4\pi a \int_0^c \sqrt{1 - \frac{\epsilon^2 z^2}{c^2}}\; dz
\]
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2 2 

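for the rotating ellipse $x^2/a^2 + z^2/c^2 = 1$. We now handle the case $a > c$ using the
same formula but involving $\epsilon^2 = (a^2 - c^2)/c^2$ defined with the opposite sign.
Thus,

\[
A = 4\pi a \int_0^c \sqrt{1 + \frac{\epsilon^2 z^2}{c^2}}\; dz,
\]

where the ellipse equation implied $x\, dx/a^2 = -z\, dz/c^2$. We solve the integral by
substituting $t = z\epsilon/c$, which yields

\[
\begin{aligned}
A &= 4\pi \frac{ac}{\epsilon} \int_0^{\epsilon} \sqrt{1 + t^2}\; dt
   = 2\pi \frac{ac}{\epsilon} \left[ t \sqrt{1 + t^2} + \ln\!\left( t + \sqrt{1 + t^2} \right) \right]_0^{\epsilon} \\
  &= 2\pi ac \left[ \sqrt{1 + \epsilon^2} + \frac{1}{\epsilon} \ln\!\left( \epsilon + \sqrt{1 + \epsilon^2} \right) \right].
\end{aligned}
\]

In the limit $a = c = R$, $\epsilon \to 0$ we obtain $4\pi R^2$ again.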

The general formula for the area of an ellipsoid with $a \neq b \neq c \neq a$ is
even more complicated, which shows that solutions to elementary problems
are sometimes surprisingly tricky. ■
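
Both closed forms can be compared against a direct numerical evaluation of Eq. (2.85). The following sketch is only a consistency check; the function names, the midpoint rule, and the sample half axes are assumptions made for illustration. Note that `math.asinh(eps)` equals $\ln(\epsilon + \sqrt{1 + \epsilon^2})$.

```python
import math

def area_numeric(a, c, n=20000):
    """Midpoint-rule evaluation of Eq. (2.85) for a rotation ellipsoid (a = b)."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * d_theta
        total += (2 * math.pi * a * math.sin(t)
                  * math.sqrt(a * a * math.cos(t) ** 2 + c * c * math.sin(t) ** 2)
                  * d_theta)
    return total

def area_closed(a, c):
    """Closed forms quoted in the example for c > a and a > c."""
    if c > a:
        eps = math.sqrt(c * c - a * a) / c
        return 2 * math.pi * a * c * (math.sqrt(1 - eps * eps) + math.asin(eps) / eps)
    if a > c:
        eps = math.sqrt(a * a - c * c) / c
        return 2 * math.pi * a * c * (math.sqrt(1 + eps * eps) + math.asinh(eps) / eps)
    return 4 * math.pi * a * a  # sphere, the eps -> 0 limit

for a, c in [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]:
    print(a, c, area_numeric(a, c), area_closed(a, c))
```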


EXERCISES 

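2.5.1 Resolve the spherical polar unit vectors into their Cartesian components.

ANS. $\hat{\mathbf{r}} = \hat{\mathbf{x}} \sin\theta \cos\varphi + \hat{\mathbf{y}} \sin\theta \sin\varphi + \hat{\mathbf{z}} \cos\theta$,
     $\hat{\boldsymbol{\theta}} = \hat{\mathbf{x}} \cos\theta \cos\varphi + \hat{\mathbf{y}} \cos\theta \sin\varphi - \hat{\mathbf{z}} \sin\theta$,
     $\hat{\boldsymbol{\varphi}} = -\hat{\mathbf{x}} \sin\varphi + \hat{\mathbf{y}} \cos\varphi$.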

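2.5.2 (a) From the results of Exercise 2.5.1, calculate the partial derivatives
of $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\varphi}}$ with respect to $r$, $\theta$, and $\varphi$.

(b) With $\nabla$ given by

\[
\hat{\mathbf{r}} \frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}} \frac{1}{r} \frac{\partial}{\partial\theta} + \hat{\boldsymbol{\varphi}} \frac{1}{r \sin\theta} \frac{\partial}{\partial\varphi}
\]

(greatest space rate of change), use the results of part (a) to calculate
$\nabla \cdot \nabla\psi$. This is an alternate derivation of the Laplacian.

Note. The derivatives of the left-hand $\nabla$ operate on the unit vectors of
the right-hand $\nabla$ before the unit vectors are dotted together.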

2.5.3 A rigid body is rotating about a fixed axis with a constant angular velocity
$\boldsymbol{\omega}$. Take $\boldsymbol{\omega}$ to be along the $z$-axis. Using spherical polar coordinates,
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(a) Calculate 


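$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$.

(b) Calculate

$\nabla \times \mathbf{v}$.

ANS. (a) $\mathbf{v} = \hat{\boldsymbol{\varphi}}\, \omega r \sin\theta$
     (b) $\nabla \times \mathbf{v} = 2\boldsymbol{\omega}$.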

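2.5.4 Resolve the Cartesian unit vectors into their spherical polar components.

$\hat{\mathbf{x}} = \hat{\mathbf{r}} \sin\theta \cos\varphi + \hat{\boldsymbol{\theta}} \cos\theta \cos\varphi - \hat{\boldsymbol{\varphi}} \sin\varphi$,
$\hat{\mathbf{y}} = \hat{\mathbf{r}} \sin\theta \sin\varphi + \hat{\boldsymbol{\theta}} \cos\theta \sin\varphi + \hat{\boldsymbol{\varphi}} \cos\varphi$,
$\hat{\mathbf{z}} = \hat{\mathbf{r}} \cos\theta - \hat{\boldsymbol{\theta}} \sin\theta$.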


2.5.5 With $\mathbf{A}$ any vector

$\mathbf{A} \cdot \nabla \mathbf{r} = \mathbf{A}$.

(a) Verify this result in Cartesian coordinates.

(b) Verify this result using spherical polar coordinates [Equation (2.75)
provides $\nabla$].

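2.5.6 Find the spherical coordinate components of the velocity and acceleration
of a moving particle:

$v_r = \dot{r}$,
$v_\theta = r\dot{\theta}$,
$v_\varphi = r \sin\theta\, \dot{\varphi}$,
$a_r = \ddot{r} - r\dot{\theta}^2 - r \sin^2\theta\, \dot{\varphi}^2$,
$a_\theta = r\ddot{\theta} + 2\dot{r}\dot{\theta} - r \sin\theta \cos\theta\, \dot{\varphi}^2$,
$a_\varphi = r \sin\theta\, \ddot{\varphi} + 2\dot{r} \sin\theta\, \dot{\varphi} + 2r \cos\theta\, \dot{\theta}\dot{\varphi}$.

Hint.

$\mathbf{r}(t) = \hat{\mathbf{r}}(t)\, r(t) = [\hat{\mathbf{x}} \sin\theta(t) \cos\varphi(t) + \hat{\mathbf{y}} \sin\theta(t) \sin\varphi(t) + \hat{\mathbf{z}} \cos\theta(t)]\, r(t)$.

Note. The dot in $\dot{r}$ means time derivative, $\dot{r} = dr/dt$. The notation was
originated by Newton. There are no $\hat{\boldsymbol{\theta}}$ or $\hat{\boldsymbol{\varphi}}$ components for $\mathbf{r}$.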

2.5.7 A particle $m$ moves in response to a central force according to Newton's
second law

$m\ddot{\mathbf{r}} = \hat{\mathbf{r}} f(r)$.

Show that $\mathbf{r} \times \dot{\mathbf{r}} = \mathbf{c}$, a constant, and that the geometric interpretation
of this (angular momentum conservation) leads to Kepler's second law,
the area law. Also explain why the particle moves in a plane.
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2 . 5.8 Express d/dx, 3/3 y, 3 / 3.2 in spherical polar coordinates. 


3 

3 y 
3 



3 1 3 



Hint. Equate $\nabla_{xyz}$ and $\nabla_{r\theta\varphi}$.
2.5.9 From Exercise 2.5.8, show that



3 

i— 


This is the quantum mechanical operator corresponding to the 
2 -component of orbital angular momentum. 

2.5.10 With the quantum mechanical orbital angular momentum operator defined
as $\mathbf{L} = -i(\mathbf{r} \times \nabla)$, show that



These are the raising and lowering operators of Section 4.3. 

2.5.11 Verify that $\mathbf{L} \times \mathbf{L} = i\mathbf{L}$ in spherical polar coordinates. $\mathbf{L} = -i(\mathbf{r} \times \nabla)$,
the quantum mechanical orbital angular momentum operator.

Hint. Use spherical polar coordinates for $\mathbf{L}$ but Cartesian components
for the cross product.

2 . 5.12 (a) From Eq. (2.75), show that 



(b) Resolving $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\varphi}}$ into Cartesian components, determine $L_x$, $L_y$,
and $L_z$ in terms of $\theta$, $\varphi$, and their derivatives.

(c) From $L^2 = L_x^2 + L_y^2 + L_z^2$, show that




This latter identity is useful in relating orbital angular momentum and 
Legendre’s differential equation (Exercise 8.9.2). 


2.5.13 With $\mathbf{L} = -i\mathbf{r} \times \nabla$, verify the operator identities

3 rxL 
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(b) rV 2 — VII 


r — ) = ?'V x L. 

dr) 

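2.5.14 Show that the following three forms (spherical coordinates) of $\nabla^2 \psi(r)$
are equivalent:

\[
\text{(a)}\ \frac{1}{r^2} \frac{d}{dr} \left[ r^2 \frac{d\psi(r)}{dr} \right], \qquad
\text{(b)}\ \frac{1}{r} \frac{d^2}{dr^2} [r \psi(r)], \qquad
\text{(c)}\ \frac{d^2 \psi(r)}{dr^2} + \frac{2}{r} \frac{d\psi(r)}{dr}.
\]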

The second form is particularly convenient in establishing a correspon¬ 
dence between spherical polar and Cartesian descriptions of a problem. 
A generalization of this appears in Exercise 8.6.11. 


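2.5.15 (a) Explain why $\nabla^2$ in plane polar coordinates follows from $\nabla^2$ in
circular cylindrical coordinates with $z$ = constant.

(b) Explain why taking $\nabla^2$ in spherical polar coordinates and restricting
$\theta$ to $\pi/2$ does not lead to the plane polar form of $\nabla^2$.

Note.

\[
\nabla^2(\rho, \varphi) = \frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho} \frac{\partial}{\partial\rho} + \frac{1}{\rho^2} \frac{\partial^2}{\partial\varphi^2}.
\]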


2.6 Tensor Analysis 


In Chapter 1, vectors were defined or represented in two equivalent ways: 
(i) geometrically by specifying magnitude and direction, as an arrow, and (ii) 
algebraically by specifying the components relative to Cartesian (orthogonal) 
coordinate axes. The second definition has been adequate for vector analysis 
so far. However, the definition of a vector as a quantity with magnitude and 
direction is not unique and therefore incomplete. We encounter quantities, 
such as elastic constants and index of refraction in anisotropic crystals, that 
have magnitude and depend on direction but that are not vectors. We seek 
a new definition of a vector using the coordinate vector r as a prototype. 
In this section, two more refined, sophisticated, and powerful definitions are 
presented that are central to the concept of a tensor generally. 

Tensors are important in many areas of physics, including general relativity 
and electrodynamics. Scalars and vectors (Sections 1.1-1.5) are special cases 
of tensors. A scalar is specified by one real number and is called a tensor of 
rank zero. In three-dimensional space, a vector is specified by $3 = 3^1$ real
numbers (e.g., its Cartesian components) and is called a tensor of rank one.
We shall see that a tensor of rank $n$ has $3^n$ components that transform in a
definite manner (see footnote 7).

We start by defining a vector in three-dimensional Euclidean space in terms 
of the behavior of its three components under rotations of the coordinate axes. 
This transformation theory philosophy is of central importance for tensor 
analysis and groups of transformations (discussed in Chapter 4 as well) and 
conforms with the mathematician’s concepts of vector and vector (or linear) 


7 In $N$-dimensional space a tensor of rank $n$ has $N^n$ components.
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space and the physicist’s notion that observables must not depend on the 
choice of coordinate frames. There is an important physical basis for such a 
philosophy: We describe our physical world by mathematics, but any physical 
predictions we make must be independent of our mathematical conventions 
such as a coordinate system with its arbitrary origin and directions of its axes. 


Rotation of Coordinate Axes 


When we assume that space is isotropic—that is, there is no preferred direc¬ 
tion or all directions are equivalent—then the physical system being analyzed 
or the relevant physical law cannot and must not depend on our choice or 
orientation of the coordinate axes. Thus, a quantity S that does not change 
under rotations of the coordinate system in three-dimensional space, S' = S, 
is a rotational invariant; it is labeled a scalar. Examples are the mass of a particle,
the scalar product of two vectors whose invariance $\mathbf{A}' \cdot \mathbf{B}' = \mathbf{A} \cdot \mathbf{B}$ under
rotations we show in the next subsection, and the volume spanned by three
vectors. Similarly, a quantity whose components transform under rotations like 
those of the coordinates of a point is called a vector. The transformation of 
the components of the vector under a rotation of the coordinates preserves the 
vector as a geometric entity (such as an arrow in space), independent of the 
orientation of the reference frame. Now we return to the concept of vector r 
as a geometric object independent of the coordinate system. Let us examine 
r in two different systems, one rotated in relation to the other. For simplicity, 
we consider first the two-dimensional case. If the $x$-, $y$-coordinates are rotated
counterclockwise through an angle $\varphi$, keeping $\mathbf{r}$ fixed (i.e., the physical
system is held fixed), we read off from Fig. 2.20 the following relations between the


Figure 2.20 Rotation of Cartesian Coordinate Axes About the $z$-Axis
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components resolved in the original system (unprimed) and those resolved in 
the new rotated system (primed): 


\[
\begin{aligned}
x' &= x \cos\varphi + y \sin\varphi, \\
y' &= -x \sin\varphi + y \cos\varphi.
\end{aligned}
\tag{2.86}
\]


We saw in Section 1.1 that a vector could be represented by the Cartesian
coordinates of a point; that is, the coordinates were proportional to the vector
components. Hence, the components of a vector must transform under rotation
as coordinates of a point (such as $\mathbf{r}$). Therefore, whenever any pair of
quantities $A_x$ and $A_y$ in the $xy$-coordinate system is transformed into $(A'_x, A'_y)$
by this rotation of the coordinate system with

\[
\begin{aligned}
A'_x &= A_x \cos\varphi + A_y \sin\varphi, \\
A'_y &= -A_x \sin\varphi + A_y \cos\varphi,
\end{aligned}
\tag{2.87}
\]


we define $A_x$ and $A_y$ as the components of a vector $\mathbf{A}$ (see footnote 8). Our vector is now
defined in terms of the transformation of its components under rotation of
the coordinate system. From a comparison of Eqs. (2.86) and (2.87), we can
conclude that $A_x$ and $A_y$ transform in the same way as $x$ and $y$, the components
of the two-dimensional coordinate vector $\mathbf{r}$. In other words, the vector $\mathbf{A}$ varies
as $\mathbf{r}$ does under rotations; that is, it "covaries." If a pair of numbers $A_x$ and $A_y$
do not show this form invariance (or covariance) when the coordinates are
rotated, they are not the components of a vector. In this sense, the coordinates
of a vector belong together. In contrast, from Fig. 2.20 we see that any single
component, such as $A_x$, of the vector $\mathbf{A}$ is not invariant under rotations, that
is, changes its length in a rotated coordinate system.

The vector components $A_x$ and $A_y$ satisfying the defining equations
[Eq. (2.87)] associate a magnitude $A$ and a direction with each point in space.
The magnitude is a scalar quantity, invariant to the rotation of the coordinate
system. The direction (relative to the unprimed system) is likewise invariant
to the rotation of the coordinate system (see Exercise 2.6.1). The result
is that the components of a vector may vary according to the rotation
of the primed coordinate system. This is what Eq. (2.87) states. However,
the variation with the angle is such that the components in the rotated
coordinate system $A'_x$ and $A'_y$ define a vector with the same magnitude and
the same direction as the vector defined by the components $A_x$ and $A_y$ relative
to the $x$-, $y$-coordinate axes (compare Exercise 2.6.1). The components
of $\mathbf{A}$ in a particular coordinate system constitute the representation of $\mathbf{A}$
in that coordinate system. Equation (2.87), the transformation relation, is a
guarantee that the entity $\mathbf{A}$ is independent of the rotation of the coordinate
system.
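
A brief numerical illustration of Eq. (2.87) may be useful here. In the sketch below (function name and sample values are illustrative assumptions) the components of a fixed vector are recomputed in axes rotated by $\varphi$: the magnitude is unchanged, while the polar angle of the vector measured from the new $x'$-axis decreases by exactly $\varphi$, as expected when the axes rotate and the vector stays put.

```python
import math

def rotate_components(ax, ay, phi):
    """Eq. (2.87): components of the same vector in axes rotated
    counterclockwise through the angle phi."""
    return (ax * math.cos(phi) + ay * math.sin(phi),
            -ax * math.sin(phi) + ay * math.cos(phi))

ax, ay, phi = 3.0, 4.0, 0.6
ax_p, ay_p = rotate_components(ax, ay, phi)

print(math.hypot(ax, ay), math.hypot(ax_p, ay_p))            # equal magnitudes
print(math.atan2(ay, ax) - phi, math.atan2(ay_p, ax_p))      # equal angles
```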


8 The corresponding definition of a scalar quantity is $S' = S$, that is, invariant under rotation of the
coordinates like the scalar product of two vectors $\mathbf{x}'_1 \cdot \mathbf{x}'_2 = \mathbf{x}_1 \cdot \mathbf{x}_2$.
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To go on to three, four, and higher dimensions, we use a more compact 
notation. Let 


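\[
x \to x_1, \qquad y \to x_2,
\tag{2.88}
\]

\[
a_{11} = \cos\varphi, \quad a_{12} = \sin\varphi, \quad a_{21} = -\sin\varphi, \quad a_{22} = \cos\varphi,
\tag{2.89}
\]

and $z \to x_3$, etc. in higher dimensions. Then Eq. (2.86) becomes

\[
\begin{aligned}
x'_1 &= a_{11} x_1 + a_{12} x_2, \\
x'_2 &= a_{21} x_1 + a_{22} x_2.
\end{aligned}
\tag{2.90}
\]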


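The coefficient $a_{ij}$ is a direction cosine, the cosine of the angle $\varphi_{i'j}$ between $x'_i$
and $x_j$; that is,

\[
a_{12} = \cos\varphi_{1'2} = \sin\varphi, \qquad
a_{21} = \cos\varphi_{2'1} = \cos\!\left( \varphi + \frac{\pi}{2} \right) = -\sin\varphi.
\tag{2.91}
\]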


The advantage of the new notation (see footnote 9) is that it permits us to use the summation
symbol and to rewrite Eq. (2.90) as

\[
x'_i = \sum_{j=1}^{2} a_{ij} x_j, \qquad i = 1, 2.
\tag{2.92}
\]


Note that i remains as a parameter that gives rise to one equation when it is 
set equal to 1 and to a second equation when it is set equal to 2. The index j, of 
course, is a summation index, a dummy index. As with a variable of integration, 
j may be replaced by any other convenient symbol. 

The generalization to three, four, or $N$ dimensions is now very simple. The
set of $N$ quantities, $V_j$, is said to be the components of an $N$-dimensional
vector, $\mathbf{V}$, if and only if their values relative to the rotated coordinate axes are
given by

\[
V'_i = \sum_{j=1}^{N} a_{ij} V_j, \qquad i = 1, 2, \ldots, N.
\tag{2.93}
\]

As before, $a_{ij}$ is the cosine of the angle between the $x'_i$ and $x_j$ directions. When
the number of dimensions that our space has is clear, the upper limit $N$ and
the corresponding range of $i$ will often not be indicated.


9 If you wonder about the replacement of one parameter $\varphi$ by four parameters $a_{ij}$, clearly the
$a_{ij}$ do not constitute a minimum set of parameters. For two dimensions, the four $a_{ij}$ are subject to the
three constraints given in Eq. (2.97). The justification for the redundant set of direction cosines
is the convenience it provides. It is hoped that this convenience will become more apparent
in the following discussion. For three-dimensional rotations (nine $a_{ij}$, of which only three are
independent) alternate descriptions are provided by (i) the Euler angles discussed in Section 3.3,
(ii) quaternions, and (iii) the Cayley-Klein parameters. These alternatives have their respective
advantages and disadvantages.
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From the definition of ay as the cosine of the angle between the positive x[ 
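direction and the positive $x_j$ direction, we may write (Cartesian coordinates; see footnote 10)

\[
a_{ij} = \frac{\partial x'_i}{\partial x_j}.
\tag{2.94}
\]

Using the inverse rotation ($\varphi \to -\varphi$) yields

\[
x_j = \sum_{i=1}^{2} a_{ij} x'_i \qquad \text{or} \qquad \frac{\partial x_j}{\partial x'_i} = a_{ij}.
\tag{2.95}
\]

Notice that this is only true for Cartesian coordinates.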

By use of Eqs. (2.94) and (2.95), Eq. (2.93) becomes

\[
V'_i = \sum_{j=1}^{N} \frac{\partial x'_i}{\partial x_j} V_j = \sum_{j=1}^{N} \frac{\partial x_j}{\partial x'_i} V_j.
\tag{2.96}
\]

The direction cosines $a_{ij}$ satisfy an orthogonality condition

\[
\sum_i a_{ij} a_{ik} = \delta_{jk}
\tag{2.97}
\]

or, equivalently,

\[
\sum_i a_{ji} a_{ki} = \delta_{jk}.
\tag{2.98}
\]

The symbol $\delta_{jk}$ in Eqs. (2.97) and (2.98) is the Kronecker delta, defined by

\[
\delta_{jk} = 1 \ \text{for}\ j = k, \qquad \delta_{jk} = 0 \ \text{for}\ j \neq k.
\tag{2.99}
\]


Equations (2.97) and (2.98) can be verified in the two-dimensional case by
substituting the specific $a_{ij}$ from Eq. (2.89). The result is the well-known identity
$\sin^2\varphi + \cos^2\varphi = 1$. To verify Eq. (2.98) in general form, we use the partial
derivative forms of Eqs. (2.94) and (2.95) to obtain

\[
\sum_i \frac{\partial x_j}{\partial x'_i} \frac{\partial x_k}{\partial x'_i}
  = \sum_i \frac{\partial x_j}{\partial x'_i} \frac{\partial x'_i}{\partial x_k}
  = \frac{\partial x_j}{\partial x_k}.
\tag{2.100}
\]

The last step follows by the standard rules for partial differentiation, assuming
that $x_j$ is a function of $x'_1$, $x'_2$, $x'_3$, and so on. The final result, $\partial x_j/\partial x_k$, is equal to
$\delta_{jk}$ since $x_j$ and $x_k$ as coordinate lines ($j \neq k$) are assumed to be perpendicular
(two or three dimensions) or orthogonal (for any number of dimensions).
Equivalently, $x_j$ and $x_k$ ($j \neq k$) are totally independent variables. If $j = k$, the
partial derivative is clearly equal to 1.
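
The orthogonality conditions (2.97) and (2.98) can also be checked numerically for a three-dimensional rotation. In the sketch below the matrix of direction cosines is built, purely for illustration, as a rotation about the $z$-axis [the three-dimensional extension of Eq. (2.89)] followed by a rotation about the new $x$-axis; the function names and angles are assumptions, not notation from the text.

```python
import math

def rot_z(phi):
    # direction cosines of Eq. (2.89), extended by an unchanged z-axis
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(alpha):
    # the analogous rotation of the axes about the x-axis
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def column_sum(a, j, k):
    """Left-hand side of Eq. (2.97): sum over i of a_ij a_ik."""
    return sum(a[i][j] * a[i][k] for i in range(3))

def row_sum(a, j, k):
    """Left-hand side of Eq. (2.98): sum over i of a_ji a_ki."""
    return sum(a[j][i] * a[k][i] for i in range(3))

a = matmul(rot_x(0.5), rot_z(1.1))
for j in range(3):
    print([round(column_sum(a, j, k), 12) for k in range(3)],
          [round(row_sum(a, j, k), 12) for k in range(3)])
```

Each printed row should reproduce the corresponding row of the Kronecker delta.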

In redefining a vector in terms of the way its components transform under 
a rotation of the coordinate system, we should emphasize two points: 


• This more precise definition is not only useful but necessary in describing 
our physical world. When all vectors in a vector equation transform covari- 
antly (i.e., transform just like the coordinate vector), the vector equation is 


10 Differentiate $x'_i$ with respect to $x_j$. See the discussion at the start of Section 1.6 for the definition
of partial derivatives. Section 3.3 provides an alternate approach.
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in essence independent of any particular coordinate system. (The coordi¬ 
nate system need not even be Cartesian.) A vector equation can always be 
expressed in some coordinate system. This independence of reference 
frame (implied by covariance) is needed to formulate universal laws of 
physics involving vectors or tensors more generally. 

• This definition is subject to a generalization that will form the basis of the 
branch of mathematics known as tensor analysis. 

On this basis, the behavior of the vector components under rotation of the 
coordinates is used next to prove that an inner product is a scalar, and then to 
prove that an outer product is a vector, and finally to show that the gradient
of a scalar, $\nabla\psi$, is a vector.


Biographical Data 

Kronecker, Leopold. Kronecker, a German mathematician, was born in 
Liegnitz, Germany (now Poland), in 1823 and died in Berlin in 1891. Son of 
a Jewish merchant, he ran the family business and was well off, so he could 
afford to retire at age 30. He obtained his Ph.D. in 1845 in Berlin, and by 
1883 he was a professor. He developed number theory and algebra but was 
obsessed with integers to the point of dismissing Cantor’s transfinite num¬ 
bers and Weierstrass’s analysis, coining the phrase “God made the integers, 
all else is the work of man.” 



Invariance of the Scalar Product under Rotations 


We have not formally shown that the word scalar is justified or that the scalar 
product is indeed a scalar quantity—that is, stays invariant under rotations of 
the coordinates. To show this, we investigate the behavior of $\mathbf{A} \cdot \mathbf{B}$ under a
rotation of the coordinate system. By use of Eq. (2.93),

\[
\mathbf{A}' \cdot \mathbf{B}' = \sum_l A'_l B'_l
  = \sum_i a_{1i} A_i \sum_j a_{1j} B_j
  + \sum_i a_{2i} A_i \sum_j a_{2j} B_j
  + \sum_i a_{3i} A_i \sum_j a_{3j} B_j.
\tag{2.101}
\]

Using the indices $k$ and $l$ to sum over 1, 2, and 3, we obtain

\[
\sum_k A'_k B'_k = \sum_l \sum_i \sum_j a_{li} A_i\, a_{lj} B_j,
\tag{2.102}
\]

and, by rearranging the terms on the right-hand side, we have

\[
\sum_k A'_k B'_k = \sum_l \sum_i \sum_j (a_{li} a_{lj}) A_i B_j
  = \sum_i \sum_j \delta_{ij} A_i B_j = \sum_i A_i B_i.
\tag{2.103}
\]

The last two steps follow by using Eq. (2.98), the orthogonality condition of
the direction cosines, and Eq. (2.99), which defines the Kronecker delta. The
effect of the Kronecker delta is to cancel all terms in a summation over either
index except the term for which the indices are equal. In Eq. (2.103), its effect
is to set $j = i$ and to eliminate the summation over $j$. Of course, we could
equally well set $i = j$ and eliminate the summation over $i$. Equation (2.103)
gives us

\[
\sum_k A'_k B'_k = \sum_i A_i B_i, \tag{2.104}
\]

which is our definition of a scalar quantity—one that remains invariant under 
the rotation of the coordinate system. 
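
The invariance expressed by Eq. (2.104) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper names (`rotation_z`, `transform`, `dot`), the rotation angle, and the two test vectors are arbitrary choices made for illustration, and the rotation is taken about the $z$-axis only.

```python
import math

def rotation_z(phi):
    """Direction cosines a[i][j] for a rotation of the axes by phi about z."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(a, v):
    """Vector transformation law A'_i = sum_j a_ij A_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    """Scalar product sum_i u_i v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))

a = rotation_z(0.7)      # arbitrary test angle (assumption)
A = [1.0, -2.0, 0.5]     # arbitrary test vectors (assumption)
B = [3.0, 4.0, -1.0]

# Orthogonality condition: sum_l a_li a_lj = delta_ij
for i in range(3):
    for j in range(3):
        total = sum(a[l][i] * a[l][j] for l in range(3))
        assert abs(total - (1.0 if i == j else 0.0)) < 1e-12

# The two numbers agree up to rounding: the scalar product is invariant.
print(dot(A, B), dot(transform(a, A), transform(a, B)))
```

The orthogonality loop mirrors the step from Eq. (2.102) to Eq. (2.103): once the sum over $l$ collapses to $\delta_{ij}$, the primed and unprimed scalar products must coincide.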


Covariance of Cross Product 


For the cross product there remains the problem of verifying that $\mathbf{C} = \mathbf{A}\times\mathbf{B}$
is indeed a vector; that is, it obeys Eq. (2.93), the vector transformation law.
In other words, we want to show that the cross product transforms like the coordinate
vector, which is the meaning of covariance here. Starting in a rotated (primed)
system,

\[
\begin{aligned}
C'_i &= A'_j B'_k - A'_k B'_j, \qquad i,\ j,\ \text{and } k \text{ in cyclic order}, \\
     &= \sum_l a_{jl}A_l \sum_m a_{km}B_m - \sum_l a_{kl}A_l \sum_m a_{jm}B_m \\
     &= \sum_{l,m} (a_{jl}a_{km} - a_{kl}a_{jm}) A_l B_m.
\end{aligned} \tag{2.105}
\]

The combination of direction cosines in parentheses vanishes for $m = l$. We
therefore have $j$ and $k$ taking on fixed values, dependent on the choice of $i$,
and six combinations of $l$ and $m$. If $i = 3$, then $j = 1$, $k = 2$ (cyclic order), and
we have the following direction cosine combinations:$^{11}$

\[
\begin{aligned}
a_{11}a_{22} - a_{21}a_{12} &= a_{33}, \\
a_{13}a_{21} - a_{23}a_{11} &= a_{32}, \\
a_{12}a_{23} - a_{22}a_{13} &= a_{31}.
\end{aligned} \tag{2.106}
\]

Equations (2.106) are identities satisfied by the direction cosines. They can
be verified with the use of determinants and matrices (see Exercise 3.3.3).
Substituting back into Eq. (2.105),

\[
\begin{aligned}
C'_3 &= a_{33}(A_1B_2 - A_2B_1) + a_{32}(A_3B_1 - A_1B_3) + a_{31}(A_2B_3 - A_3B_2) \\
     &= a_{31}C_1 + a_{32}C_2 + a_{33}C_3 = \sum_n a_{3n}C_n.
\end{aligned} \tag{2.107}
\]

By permuting indices to pick up $C'_1$ and $C'_2$, we see that Eq. (2.93) is satisfied
and $\mathbf{C}$ is indeed a vector. It should be mentioned here that this vector nature
of the cross product is an accident associated with the three-dimensional
nature of ordinary space.$^{12}$ We shall also see that the cross product of two
vectors may be treated as a second-rank antisymmetric tensor.

$^{11}$ Equations (2.106) hold for rotations because they preserve volumes. For a more general
orthogonal transformation the right-hand side of Eq. (2.106) is multiplied by the determinant
of the transformation matrix (see Chapter 3 for matrices and determinants). As a consequence,
the cross product is a pseudovector rather than a vector.
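
The transformation behavior of the cross product, including the determinant factor mentioned in footnote 11, can be illustrated with a short numerical sketch. The helper names, the rotation angles, and the test vectors below are assumptions chosen for the example; the inversion matrix stands in for a general improper orthogonal transformation.

```python
import math

def cross(u, v):
    """Cartesian components of u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(a, v):
    """Vector law V'_i = sum_j a_ij V_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def max_diff(u, v):
    return max(abs(x - y) for x, y in zip(u, v))

c1, s1 = math.cos(0.4), math.sin(0.4)   # arbitrary angles (assumption)
c2, s2 = math.cos(1.1), math.sin(1.1)
rot_z = [[c1, s1, 0.0], [-s1, c1, 0.0], [0.0, 0.0, 1.0]]
rot_x = [[1.0, 0.0, 0.0], [0.0, c2, s2], [0.0, -s2, c2]]
rotation = matmul(rot_x, rot_z)                                      # det = +1
inversion = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]   # det = -1

A = [1.0, 2.0, 3.0]
B = [-1.0, 0.5, 2.0]
C = cross(A, B)

# Rotation: the cross product of the transformed vectors equals the
# transformed cross product, as in Eq. (2.107).
rotated_cross = cross(matvec(rotation, A), matvec(rotation, B))
rotation_error = max_diff(rotated_cross, matvec(rotation, C))

# Inversion: A -> -A and B -> -B leave A x B unchanged, whereas a true
# vector would change sign. The extra factor is det(a) = -1.
inverted_cross = cross(matvec(inversion, A), matvec(inversion, B))
print(rotation_error, inverted_cross, C)
```

Under the proper rotation the error is at rounding level, while under inversion the cross product keeps its components instead of reversing them, which is the pseudovector behavior.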


Covariance of Gradient 


There is one more ambiguity in the transformation law of a vector, Eq. (2.93),
in which $a_{ij}$ is the cosine of the angle between the $x'_i$-axis and the $x_j$-axis. If
we start with a differential distance vector $d\mathbf{r}$ and take $dx'_i$ to be a function
of the unprimed variables,

\[
dx'_i = \sum_j \frac{\partial x'_i}{\partial x_j}\, dx_j \tag{2.108}
\]

by partial differentiation (see Section 1.5 for the total differential), then Eqs.
(2.108) and (2.92) are consistent. Any set of quantities $V^j$ transforming
according to Eq. (2.96) is defined as a contravariant vector.

However, we have already encountered a slightly different type of vector
transformation. The gradient $\nabla\varphi$ defined by Eq. (1.64) (using $x_1$, $x_2$, $x_3$ for $x$,
$y$, $z$) transforms as

\[
\frac{\partial\varphi'}{\partial x'_i} = \sum_j \frac{\partial\varphi}{\partial x_j}\frac{\partial x_j}{\partial x'_i}
  = \sum_j \frac{\partial x_j}{\partial x'_i}\frac{\partial\varphi}{\partial x_j}, \tag{2.109}
\]

using the chain rule and $\varphi = \varphi(x, y, z) = \varphi(x', y', z') = \varphi'$ with $\varphi$ defined as a
scalar quantity. Notice that this differs from Eqs. (2.93) and (2.94) in that we
have $\partial x_j/\partial x'_i$ instead of $\partial x'_i/\partial x_j$. Equation (2.109) is taken as the definition of
a covariant vector with the gradient as the prototype.

In Cartesian coordinates only,

\[
\frac{\partial x_j}{\partial x'_i} = \frac{\partial x'_i}{\partial x_j} = a_{ij}, \tag{2.110}
\]

and there is no difference between contravariant and covariant
transformations.

In other systems, Eq. (2.110) in general does not apply, and the 
distinction between contravariant and covariant is real and must be 
observed. This is of prime importance in the curved Riemannian space 
of general relativity. Let us illustrate the difference between covariant and 
contravariant vectors in Minkowski space of special relativity. 
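
Before turning to Minkowski space, the distinction can be made concrete with a linear but non-orthogonal change of coordinates. The sketch below is illustrative only: the matrix `J` (playing the role of $\partial x'_i/\partial x_j$), the sample displacement, and the sample gradient are hypothetical values, and the example is restricted to two dimensions to keep the inverse explicit.

```python
import math

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def contravariant(J, v):
    """V'^i = sum_j (dx'_i/dx_j) V^j, with J[i][j] = dx'_i/dx_j."""
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]

def covariant(J, g):
    """G'_i = sum_j (dx_j/dx'_i) G_j; dx_j/dx'_i is element [j][i] of J^-1."""
    J_inv = inverse_2x2(J)
    return [sum(J_inv[j][i] * g[j] for j in range(2)) for i in range(2)]

def contract(g, v):
    return sum(gi * vi for gi, vi in zip(g, v))

J = [[2.0, 1.0], [0.0, 1.0]]   # hypothetical non-orthogonal map x' = J x
dx = [0.3, -0.2]               # displacement: contravariant components
grad = [1.5, 4.0]              # gradient of a scalar: covariant components

dx_p = contravariant(J, dx)
grad_p = covariant(J, grad)

# The two laws give different results for the same components ...
print(contravariant(J, grad), grad_p)
# ... yet the contraction (the change d(phi) of the scalar) is unchanged.
print(contract(grad, dx), contract(grad_p, dx_p))

# For a rotation the two laws coincide, as stated by Eq. (2.110).
c, s = math.cos(0.5), math.sin(0.5)
R = [[c, s], [-s, c]]
print(contravariant(R, grad), covariant(R, grad))
```

The invariance of the contraction is the reason the two transformation laws must be inverse to one another whenever the transformation is not orthogonal.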

In the remainder of this section, the components of a contravariant vector
are denoted by a superscript, $A^i$, whereas a subscript is used for the
components of a covariant vector $A_i$.$^{13}$


$^{12}$ Specifically, Eq. (2.106) holds only for three-dimensional space. For a far-reaching generalization
of the cross product, see D. Hestenes and G. Sobczyk, Clifford Algebra to Geometric Calculus.
Reidel, Dordrecht (1984).

$^{13}$ This means that the coordinates $(x, y, z)$ should be written $(x^1, x^2, x^3)$ since $\mathbf{r}$ transforms as a
contravariant vector. Because we restrict our attention to Cartesian tensors (where the
distinction between contravariance and covariance disappears), we continue to use subscripts on the
coordinates. This avoids the ambiguity of $x^2$ representing both $x$ squared and $y$.


EXAMPLE 2.6.1


Gradient in Minkowski Space. The distance $x^2 = c^2t^2 - \mathbf{r}^2$ of the
coordinate four-vector $x^\mu = (ct, \mathbf{r})$ defines the (indefinite) metric of Minkowski
space, the four-dimensional space-time of special relativity. The scalar
product $a\cdot b = a_0b_0 - \mathbf{a}\cdot\mathbf{b}$ has the same relative minus sign. Four-vectors
transforming like $x^\mu$ are called contravariant and those like $x_\mu = (ct, -\mathbf{r})$ that contain
the minus sign of the metric are called covariant so that the distance becomes
$x^2 = \sum_\mu x^\mu x_\mu$ and the scalar product simply $a\cdot b = \sum_\mu a^\mu b_\mu$. Covariant and
contravariant four-vectors clearly differ. The relative minus sign between them
is important, as the gradient $\partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right)$ and the continuity equation of
the current $J^\mu = (c\rho, \mathbf{J})$,

\[
\sum_\mu \partial^\mu J_\mu = \frac{\partial\rho}{\partial t} - (-\nabla\cdot\mathbf{J}) = \partial\cdot J,
\]

show. The reason for the minus sign in the definition of the gradient is that
$\partial^\mu = \partial/\partial x_\mu$. This can be checked by the derivative of the metric

\[
\frac{\partial x^2}{\partial x_\mu} = \frac{\partial \sum_\nu x^\nu x_\nu}{\partial x_\mu} = 2x^\mu,
\]

which transforms (contravariantly) like the coordinate vector $x^\mu$, and
$\frac{\partial}{\partial x_\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right)$.
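
The role of the relative minus sign can be checked numerically: the product $\sum_\mu a^\mu b_\mu$ is unchanged by a Lorentz boost, whereas a plain Euclidean sum of products would not be. The sketch below assumes a boost along the $x$-axis with $\beta = v/c = 0.6$ and two arbitrary four-vectors; the function names are illustrative and not taken from the text.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with beta = v/c, acting on (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(L, a):
    return [sum(L[m][n] * a[n] for n in range(4)) for m in range(4)]

def lower(a):
    """Covariant components a_mu = (a^0, -a) from contravariant a^mu = (a^0, a)."""
    return [a[0], -a[1], -a[2], -a[3]]

def minkowski_dot(a, b):
    """a . b = sum_mu a^mu b_mu = a^0 b^0 - (three-vector dot product)."""
    b_low = lower(b)
    return sum(a[m] * b_low[m] for m in range(4))

def euclidean_dot(a, b):
    return sum(x * y for x, y in zip(a, b))

L = boost_x(0.6)             # arbitrary boost velocity (assumption)
a = [2.0, 1.0, 0.0, 0.5]     # arbitrary four-vectors (assumption)
b = [1.0, -1.0, 3.0, 0.0]

# Invariant under the boost:
print(minkowski_dot(a, b), minkowski_dot(apply(L, a), apply(L, b)))
# Not invariant without the metric's minus sign:
print(euclidean_dot(a, b), euclidean_dot(apply(L, a), apply(L, b)))
```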


Definition of Tensors of Rank Two 


To derive the principal moments of inertia, we rotate the inertia ellipsoid to its 
principal axes. Thus, we need to obtain the principal moments from the original 
moments. That is, we have to generalize the transformation theory of vectors 
to tensors as well. Now we proceed to define contravariant, mixed, and 
covariant tensors of rank 2 by the following equations for their components 
under coordinate transformations (which are separate vector transformations 
for each index of a tensor):

\[
\begin{aligned}
A'^{ij} &= \sum_{kl} \frac{\partial x'_i}{\partial x_k}\frac{\partial x'_j}{\partial x_l}\, A^{kl}, \\
B'^i_j  &= \sum_{kl} \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}\, B^k_l, \\
C'_{ij} &= \sum_{kl} \frac{\partial x_k}{\partial x'_i}\frac{\partial x_l}{\partial x'_j}\, C_{kl}.
\end{aligned} \tag{2.111}
\]

Clearly, the rank goes as the number of indices and, therefore, of partial
derivatives (or direction cosines) in the definition: zero for a scalar, one for a vector,
two for a second-rank tensor, and so on. Each index (subscript or superscript)
ranges over the number of dimensions of the space. The number of indices
(rank of tensor) is independent of the dimensions of the space. $A^{kl}$ is
contravariant with respect to both indices, $C_{kl}$ is covariant with respect to both
indices, and $B^k_l$ transforms contravariantly with respect to the first index $k$ but
covariantly with respect to the second index $l$. Again, if we are using Cartesian
coordinates, all three forms of the tensors of second rank (contravariant,
mixed, and covariant) are the same.

As with the components of a vector, the transformation laws for the
components of a tensor [Eq. (2.111)] yield entities (and properties) that are
independent of the choice of reference frame. This is what makes tensor analysis
important in physics. The independence of reference frame (covariance) is
ideal for expressing and investigating universal physical laws.

The second-rank tensor $A$ (components $A^{kl}$) may be conveniently
represented by writing out its components in a square array ($3 \times 3$ if we are in
three-dimensional space),

\[
A = \begin{pmatrix}
A^{11} & A^{12} & A^{13} \\
A^{21} & A^{22} & A^{23} \\
A^{31} & A^{32} & A^{33}
\end{pmatrix}.
\]


This does not mean that any square array of numbers or functions forms a 
tensor. The essential condition is that the components transform according to 
Eq. (2.111). 

Examples of tensors abound in physics and engineering. The inertia tensor
[see Eqs. (3.126)-(3.128)], the Kronecker $\delta$, the metric, Ricci, and
energy-momentum tensors of general relativity, and the electromagnetic field tensor
(see Example 2.6.2) are all second-rank tensors. The multipole moments of
electrostatics and gravity supply tensors of various ranks (the quadrupole moment is
of rank two), the antisymmetric Levi-Civita symbol [see Eq. (2.133)] is of
rank three in three-dimensional Euclidean space, and the Riemann curvature
tensor is of rank four.

In the context of matrix analysis, the preceding transformation equations 
become (for Cartesian coordinates) an orthogonal similarity transformation 
(Section 3.3). A geometrical interpretation of a second-rank tensor (the inertia 
tensor) is developed in Section 3.5. 
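
For Cartesian coordinates the double sum in Eq. (2.111) and the matrix form of the orthogonal similarity transformation can be compared directly. The sketch below is a hedged illustration: `transform_rank2` is a hypothetical helper, the sample tensor `T` is an arbitrary symmetric array, and the rotation is about the $z$-axis by an arbitrary angle.

```python
import math

def transform_rank2(a, T):
    """T'_ij = sum_kl a_ik a_jl T_kl, the Cartesian form of Eq. (2.111)."""
    n = len(a)
    return [[sum(a[i][k] * a[j][l] * T[k][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

c, s = math.cos(0.9), math.sin(0.9)          # arbitrary angle (assumption)
a = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
T = [[2.0, -1.0, 0.0],                       # arbitrary symmetric sample tensor
     [-1.0, 3.0, 0.5],
     [0.0, 0.5, 1.0]]

T_index = transform_rank2(a, T)                   # index form
T_matrix = matmul(matmul(a, T), transpose(a))     # matrix form a T a^T

max_error = max(abs(T_index[i][j] - T_matrix[i][j])
                for i in range(3) for j in range(3))
print(max_error, trace(T), trace(T_index))
```

The agreement of the two forms is what allows the tensor law to be handled with matrix methods in Chapter 3; the unchanged trace anticipates the discussion of contraction in Section 2.7.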


Addition and Subtraction of Tensors 


The addition and subtraction of tensors is defined in terms of the individual 
elements just as for vectors. If 


\[
A + B = C, \tag{2.113}
\]

then

\[
A^{ij} + B^{ij} = C^{ij}.
\]


Of course, A and B must be tensors of the same rank and both expressed in a 
space of the same number of dimensions. 


Summation Convention 


In tensor analysis it is customary to adopt a summation convention to put
Eq. (2.111) and subsequent tensor equations in a more compact form. As long as
we are distinguishing between contravariance and covariance, let us agree that
when an index appears on one side of an equation, once as a superscript and
once as a subscript (except for the coordinates, where both are subscripts), we
automatically sum over that index. Then we may write the second expression
in Eq. (2.111) as

\[
B'^i_j = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}\, B^k_l, \tag{2.114}
\]

with the summation of the right-hand side over $k$ and $l$ implied. This is
Einstein’s summation convention.$^{14}$

To illustrate the use of the summation convention and some of the
techniques of tensor analysis, let us show that the now familiar Kronecker
delta, $\delta_{kl}$, is really a mixed tensor of rank two, $\delta^k_l$.$^{15}$ Does $\delta^k_l$ transform
according to Eq. (2.111)? This is our criterion for calling it a tensor. Using the
summation convention, $\delta^k_l$ transforms into

\[
\delta^k_l\, \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}
  = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j}, \tag{2.115}
\]

the last step by definition of the Kronecker delta. Now

\[
\frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j} = \frac{\partial x'_i}{\partial x'_j} \tag{2.116}
\]

by direct partial differentiation of the right-hand side (chain rule). However,
$x'_i$ and $x'_j$ are independent coordinates, and therefore the variation of one with
respect to the other must be zero if they are different and unity if they coincide;
that is,

\[
\frac{\partial x'_i}{\partial x'_j} = \delta'^i_j. \tag{2.117}
\]

Hence,

\[
\delta'^i_j = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}\, \delta^k_l,
\]

showing that the $\delta^k_l$ are indeed the components of a mixed second-rank tensor.
Notice that this result is independent of the number of dimensions of our space.

Because the Kronecker delta has the same components in all of our rotated 
coordinate systems, it is called isotropic. In Section 2.9, we discuss a third- 
rank isotropic tensor and three fourth-rank isotropic tensors. No isotropic 
first-rank tensor (vector) exists. 


Symmetry-Antisymmetry 


The order in which the indices appear in our description of a tensor is
important. In general, $A^{mn}$ is independent of $A^{nm}$, but there are some cases of
special interest. If, for all $m$ and $n$,

\[
A^{mn} = A^{nm}, \tag{2.118}
\]

we call the tensor symmetric. If, on the other hand,

\[
A^{mn} = -A^{nm}, \tag{2.119}
\]

the tensor is antisymmetric. Clearly, every (second-rank) tensor can be
resolved into symmetric and antisymmetric parts by the identity

\[
A^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}) + \tfrac{1}{2}(A^{mn} - A^{nm}), \tag{2.120}
\]

the first term on the right being a symmetric tensor and the second term an
antisymmetric tensor. A similar resolution of functions into symmetric and
antisymmetric parts is of extreme importance in quantum mechanics.


$^{14}$ In this context, $\partial x'_i/\partial x_k$ might better be written as $a^i_k$ and $\partial x_l/\partial x'_j$ as $b^l_j$.

$^{15}$ It is common practice to refer to a tensor $A$ by specifying a typical component, $A_{ij}$. As long as
one refrains from writing nonsense such as $A = A_{ij}$, no harm is done.


EXAMPLE 2.6.2


Inertia, Quadrupole, and Electromagnetic Field Tensors. The off-diagonal
moments of inertia, such as $I_{xy} = -\sum_i m_i x_i y_i = I_{yx}$, show that
the inertia tensor is symmetric. Similarly, the electric quadrupole moments
$Q_{ij} = \int (3x_ix_j - r^2\delta_{ij})\rho(\mathbf{r})\,d^3r = Q_{ji}$ form a manifestly symmetric tensor. In
contrast, the electromagnetic field tensor $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$ with the
four-vector gradient $\partial^\mu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\nabla\right)$ is manifestly antisymmetric.


Spinors


It was once thought that the system of scalars, vectors, tensors (second rank),
and so on formed a complete mathematical system, one that is adequate for
describing a physics independent of the choice of reference frame. However,
the universe and mathematical physics are not this simple. In the realm of
elementary particles, for example, spin zero particles$^{16}$ ($\pi$ mesons and $\alpha$
particles) may be described with scalars, spin 1 particles (deuterons) by vectors,
and spin 2 particles (gravitons) by tensors. This listing omits the most common
particles: electrons, protons, neutrinos, and neutrons, all with spin $\tfrac{1}{2}$. These
particles are properly described by spinors, which are two-component wave
functions, one for the spin up and the other for the spin down state. A spinor is
not a scalar, vector, or tensor. A brief introduction to spinors in the context of
group theory appears in Section 4.3.


SUMMARY


Tensors are systems of components organized by one, two, or more indices that
transform according to specific rules under a set (group) of transformations.
The number of indices is called the rank of the tensor. If the transformations are
rotations in three-dimensional space, then tensor analysis amounts to what we
did in the sections on curvilinear coordinates and in Cartesian coordinates in
Chapter 1. In four dimensions of Minkowski space-time, the transformations
are Lorentz transformations. Here, tensors of rank 1 are the four-vectors of
Chapter 4.


$^{16}$ The particle spin is intrinsic angular momentum (in units of $\hbar$). It is distinct from classical,
orbital angular momentum due to motion.

EXERCISES

2.6.1 (a) Show that the magnitude of a vector $\mathbf{A}$, $A = (A_x^2 + A_y^2)^{1/2}$, is
independent of the orientation of the rotated coordinate system,

\[
(A_x^2 + A_y^2)^{1/2} = (A'^2_x + A'^2_y)^{1/2},
\]

independent of the rotation angle $\varphi$.

This independence of angle is expressed by saying that $A$ is invariant
under rotations.

(b) At a given point $(x, y)$, $\mathbf{A}$ defines an angle $\alpha$ relative to the positive
$x$-axis and $\alpha'$ relative to the positive $x'$-axis. The angle from $x$ to $x'$
is $\varphi$. Show that $\mathbf{A} = \mathbf{A}'$ defines the same direction in space when
expressed in terms of its primed components and in terms of its
unprimed components; that is,

\[
\alpha' = \alpha - \varphi.
\]

2.6.2 Show that if all components of a tensor of any rank vanish in one particular
coordinate system, they vanish in all coordinate systems.

Note. This point takes on special importance in the four-dimensional 
curved space of general relativity. If a quantity, expressed as a tensor, 
is defined in one coordinate system, it exists in all coordinate systems 
and is not just a consequence of a choice of a coordinate system (as are 
centrifugal and Coriolis forces in Newtonian mechanics). 

2.6.3 The components of tensor A are equal to the corresponding components 
of tensor B in one particular coordinate system; that is, 


Show that tensor $A$ is equal to tensor $B$, $A_{ij} = B_{ij}$, in all coordinate
systems.

2.6.4 The first three components of a four-dimensional vector vanish in each
of two reference frames. If the second reference frame is not merely a
rotation of the first about the $x_4$ axis (that is, if at least one of the
coefficients $a_{i4}$ ($i = 1, 2, 3$) is nonzero), show that the fourth component vanishes in
all reference frames. Translated into relativistic mechanics this means
that if momentum is conserved in two Lorentz frames, then energy is
conserved in all Lorentz frames.

2.6.5 From an analysis of the behavior of a general second-rank tensor under
90° and 180° rotations about the coordinate axes, show that an isotropic
second-rank tensor in three-dimensional space must be a multiple of $\delta_{ij}$.

2.6.6 $T_{iklm}$ is antisymmetric with respect to all pairs of indices. How many
independent components has it (in three-dimensional space)?



2.7 Contraction and Direct Product 




Contraction 


When dealing with vectors, we formed a scalar product (Section 1.2) by
summing products of corresponding components:

\[
\mathbf{A}\cdot\mathbf{B} = A_iB_i \quad \text{(summation convention)}. \tag{2.121}
\]

The generalization of this expression in tensor analysis is a process known as
contraction. Two indices, one covariant and the other contravariant, are set
equal to each other and then (as implied by the summation convention) we
sum over this repeated index. For example, let us contract the second-rank
mixed tensor $B'^i_j$:

\[
B'^i_i = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_i}\, B^k_l
  = \frac{\partial x_l}{\partial x_k}\, B^k_l \tag{2.122}
\]

by Eq. (2.98), and then by Eq. (2.99),

\[
B'^i_i = \delta^l_k B^k_l = B^k_k. \tag{2.123}
\]


Remember that $i$ and $k$ are both summed. Our contracted second-rank mixed
tensor is invariant and therefore a scalar.$^{17}$ This is exactly what we obtained
in Eq. (2.102) for the dot product of two vectors and in Section 1.6 for the
divergence of a vector.
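
The invariance of the contracted mixed tensor does not rely on the transformation being a rotation. The following sketch, under stated assumptions, applies the mixed law of Eq. (2.111) with a hypothetical non-orthogonal matrix `J` standing for $\partial x'_i/\partial x_k$ and an arbitrary sample tensor `B`, in two dimensions.

```python
def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transform_mixed(J, B):
    """B'^i_j = (dx'_i/dx_k)(dx_l/dx'_j) B^k_l with J[i][k] = dx'_i/dx_k."""
    J_inv = inverse_2x2(J)                 # J_inv[l][j] = dx_l/dx'_j
    return [[sum(J[i][k] * J_inv[l][j] * B[k][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

def contract(B):
    """Set the two indices equal and sum: B^k_k."""
    return sum(B[k][k] for k in range(len(B)))

J = [[2.0, 1.0], [0.0, 1.0]]    # hypothetical non-orthogonal transformation
B = [[1.0, 2.0], [3.0, 4.0]]    # arbitrary mixed-tensor components

B_primed = transform_mixed(J, B)
print(B_primed)                        # the components change ...
print(contract(B), contract(B_primed)) # ... the contraction does not
```

Because one index carries $\partial x'_i/\partial x_k$ and the other its inverse, the two factors cancel in the sum, which is the content of Eqs. (2.122) and (2.123).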


SUMMARY


In general, the operation of contraction reduces the rank of a tensor by two.
Examples of the use of contraction abound in physics and engineering: the
scalar product of two vectors, the triple scalar product for the volume spanned
by three vectors, and determinants, which are contractions of the antisymmetric
Levi-Civita tensor with $n$ vectors in $n$-dimensional Euclidean space and are
discussed in Chapter 3.


Direct Product


The components of a covariant vector (first-rank tensor) $a_i$ and those of a
contravariant vector (first-rank tensor) $b^j$ may be multiplied component by
component to give the general term $a_ib^j$. By Eq. (2.111), this is actually a
mixed second-rank tensor for

\[
a'_i b'^j = \frac{\partial x_k}{\partial x'_i}\, a_k\, \frac{\partial x'_j}{\partial x_l}\, b^l
  = \frac{\partial x_k}{\partial x'_i}\frac{\partial x'_j}{\partial x_l}\,(a_k b^l). \tag{2.124}
\]

Note that the tensors $b_ia^j$, $a_ib_j$, and $a^ib^j$ are all different. Contracting, we
obtain

\[
a'_i b'^i = a_k b^k, \tag{2.125}
\]

as in Eq. (2.104) to give the regular scalar product.


$^{17}$ In matrix analysis, this scalar is the trace of the matrix (Section 3.2).

The operation of adjoining two vectors $a_i$ and $b^j$ as in the last paragraph is
known as forming the direct product. For the case of two vectors, the direct
product is a tensor of second rank. In this sense, we may attach meaning to $\nabla\mathbf{E}$,
which was not defined within the framework of vector analysis. In general, the
direct product of two tensors is a tensor of rank equal to the sum of the two
initial ranks; that is,

\[
A^i_j B^{kl} = C^{ikl}_j, \tag{2.126a}
\]

where $C^{ikl}_j$ is a tensor of fourth rank. From Eqs. (2.111),

\[
C'^{ikl}_j = \frac{\partial x'_i}{\partial x_m}\frac{\partial x_n}{\partial x'_j}
  \frac{\partial x'_k}{\partial x_p}\frac{\partial x'_l}{\partial x_q}\, C^{mpq}_n. \tag{2.126b}
\]
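
In Cartesian coordinates the statement that a direct product of two vectors is a second-rank tensor can be tested directly: transforming the product with the rank-two law must give the same array as forming the product of the transformed vectors. The sketch below uses hypothetical helper names, an arbitrary rotation about the $z$-axis, and arbitrary sample vectors.

```python
import math

def outer(u, v):
    """Direct product T_ij = u_i v_j."""
    return [[ui * vj for vj in v] for ui in u]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_rank2(a, T):
    """T'_ij = sum_kl a_ik a_jl T_kl (Cartesian tensor law)."""
    r = range(3)
    return [[sum(a[i][k] * a[j][l] * T[k][l] for k in r for l in r)
             for j in r] for i in r]

def contract(T):
    return sum(T[i][i] for i in range(3))

c, s = math.cos(0.3), math.sin(0.3)     # arbitrary angle (assumption)
a = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
u = [1.0, 2.0, -1.0]                    # arbitrary sample vectors
v = [0.5, -3.0, 2.0]

T = outer(u, v)
lhs = transform_rank2(a, T)                     # transform the product
rhs = outer(matvec(a, u), matvec(a, v))         # product of transformed vectors
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))

# Contracting the direct product recovers the scalar product u . v.
print(max_error, contract(T), contract(lhs))
```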


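EXAMPLE 2.7.1


Direct Product of Vectors. Let us consider two-particle spin states in
quantum mechanics that are constructed as direct products of single-particle states.
The spin-$\tfrac{1}{2}$ states $|{\uparrow}\rangle$, $|{\downarrow}\rangle$ combine to spin 0 and spin 1. We may also describe
them by unit vectors

\[
|{\uparrow}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
|{\downarrow}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]

and direct product states, such as $|{\uparrow\uparrow}\rangle = |{\uparrow}\rangle|{\uparrow}\rangle$, are examples of $a_ib_j$ for
$i = 1$, $j = 1$ in Eq. (2.124), etc. The triplet spin states can be described as

\[
|{\uparrow\uparrow}\rangle = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr)
  = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \frac{1}{\sqrt{2}}\,\sigma_1, \qquad
|{\downarrow\downarrow}\rangle = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
\]

The spin 0 singlet state $\frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle|{\downarrow}\rangle - |{\downarrow}\rangle|{\uparrow}\rangle\bigr)$ as a direct product can then be
represented by the matrix

\[
\frac{1}{\sqrt{2}}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \frac{1}{\sqrt{2}}\, i\sigma_2,
\]

with $\sigma_2$ one of the Pauli spin matrices. ■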
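
The matrices of this example follow from forming outer products of the two unit vectors. The sketch below assumes the representation $|{\uparrow}\rangle = (1, 0)$, $|{\downarrow}\rangle = (0, 1)$ and the normalization $1/\sqrt{2}$ for the mixed states; the helper names are illustrative.

```python
import math

up = [1.0, 0.0]       # assumed representation of the spin-up state
down = [0.0, 1.0]     # assumed representation of the spin-down state

def outer(u, v):
    """Direct product with components u_i v_j."""
    return [[ui * vj for vj in v] for ui in u]

def combine(p, q, sign):
    """(p + sign * q) / sqrt(2), element by element."""
    norm = 1.0 / math.sqrt(2.0)
    return [[norm * (p[i][j] + sign * q[i][j]) for j in range(2)]
            for i in range(2)]

up_up = outer(up, up)
down_down = outer(down, down)
triplet_zero = combine(outer(up, down), outer(down, up), +1)   # symmetric
singlet = combine(outer(up, down), outer(down, up), -1)        # antisymmetric

# Pauli matrix sigma_2 and the product i * sigma_2 for comparison.
sigma_2 = [[0, -1j], [1j, 0]]
i_sigma_2 = [[1j * sigma_2[i][j] for j in range(2)] for i in range(2)]

print(up_up, down_down)
print(triplet_zero)
print(singlet, i_sigma_2)
```

The triplet combinations come out symmetric under interchange of the two indices and the singlet antisymmetric, in line with the decomposition of Eq. (2.120).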


The direct product is a technique for creating higher rank tensors.
Exercise 2.7.1 is a form of the direct product in which the first factor is $\nabla$. Other
applications appear in Section 4.6.

When $T$ is an $n$th rank Cartesian tensor, $(\partial/\partial x_i)T_{jkl\ldots}$, an element of $\nabla T$,
is a Cartesian tensor of rank $n + 1$ (Exercise 2.7.1). However, $(\partial/\partial x_i)T_{jkl\ldots}$
is not a tensor under more general transformations. In non-Cartesian systems
$\partial/\partial x'_i$ will act on the partial derivatives $\partial x_p/\partial x'_q$ and destroy the simple tensor
transformation relation.

So far, the distinction between a covariant transformation and a contravariant
transformation has been maintained because it does exist in non-Cartesian
space and because it is of great importance in general relativity. Now,
however, because of the simplification achieved, we restrict ourselves to
Cartesian tensors. As noted in Section 2.6, the distinction between contravariance
and covariance disappears and all indices from now on are shown as
subscripts. We restate the summation convention and the operation of contraction.


SUMMARY


The operations of tensor algebra include contracting two indices, which
generalizes the trace of a matrix and the scalar or dot product of two vectors to
arbitrary tensors. The direct product of tensors is converse to contraction by
forming products of all individual tensor components and enlarging the set of
indices to comprise those of all tensors involved.

EXERCISES

2.7.1 If $T_{\ldots i}$ is a tensor of rank $n$, show that $\partial T_{\ldots i}/\partial x_j$ is a tensor of rank $n + 1$
(Cartesian coordinates).

Note. In non-Cartesian coordinate systems the coefficients $a_{ij}$ are, in
general, functions of the coordinates, and the simple derivative of a
tensor of rank $n$ is not a tensor except in the special case of $n = 0$. In
this case, the derivative does yield a covariant vector (tensor of rank 1)
by Eq. (2.109).

2.7.2 If $T_{ijk\ldots}$ is a tensor of rank $n$, show that $\partial T_{ijk\ldots}/\partial x_j$ is a tensor of rank
$n - 1$ (Cartesian coordinates).

2.7.3 Show that the operator

\[
\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2
\]

is a scalar operator in Minkowski space-time.


2.8 Quotient Rule 


If $A_i$ and $B_j$ are vectors, as defined in Section 2.6, then the direct product $A_iB_j$
is a second-rank tensor. Here, we are concerned with recognizing tensors that
are implicitly defined in some relation. For example, consider such equations
as

\begin{align}
K^i A_i &= B, \tag{2.127a} \\
K^i_j A^j &= B^i, \tag{2.127b} \\
K^i_j A^{jk} &= B^{ik}, \tag{2.127c} \\
K^{ijkl} A_{ij} &= B^{kl}, \tag{2.127d} \\
K_{ij} A_k &= B_{ijk}, \tag{2.127e}
\end{align}


where in each of these relations $A$ and $B$ are known to be tensors of a rank
indicated by the number of indices. In each case, $K$ is an unknown quantity. We
wish to establish the transformation properties of $K$. Let us demonstrate the
quotient rule for Eq. (2.127b), which asserts that if the equation of interest
holds in all (rotated) Cartesian coordinate systems, then $K$ is a tensor of the
indicated rank. The proof for the other equations is similar and left as an
exercise. The importance in physical theory is that the quotient rule can establish
the tensor nature of quantities that are defined in relations. Exercise 2.8.1 is a
simple illustration of this. The quotient rule shows that the inertia matrix
appearing in the angular momentum equation $\mathbf{L} = \mathsf{I}\boldsymbol{\omega}$ (Section 3.5) is a tensor
of rank 2.

In proving the quotient rule, we consider Eq. (2.127b) as a typical case. In
our primed coordinate system

\[
K'^i_j A'^j = B'^i = a^i_k B^k, \tag{2.128}
\]

using the vector transformation properties of $B$. Since the equation holds in
all rotated Cartesian coordinate systems,

\[
a^i_k B^k = a^i_k (K^k_l A^l). \tag{2.129}
\]

Now, transforming $A$ back into the primed coordinate system$^{18}$ [compare
Eq. (2.96)], we have

\[
K'^i_j A'^j = a^i_k K^k_l\, a^l_j A'^j. \tag{2.130}
\]

Rearranging, we obtain

\[
(K'^i_j - a^i_k a^l_j K^k_l)\, A'^j = 0. \tag{2.131}
\]

This must hold for each value of the index $i$ and for every primed coordinate
system. Since the $A'^j$ is arbitrary,$^{19}$ we conclude

\[
K'^i_j = a^i_k a^l_j K^k_l, \tag{2.132}
\]

which is our definition of a mixed second-rank tensor.

The other equations may be treated similarly, giving rise to other forms of 
the quotient rule. One minor pitfall should be noted: The quotient rule does 
not necessarily apply if B is zero. The transformation properties of zero are 
indeterminate. 
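
The argument leading to Eq. (2.132) can be mimicked numerically for rotated Cartesian systems. In the sketch below, which is an illustration under stated assumptions rather than a proof, $K'$ is built purely from the requirement that $B' = K'A'$ hold for the three unit vectors $A'$ (the special choices of footnote 19) and is then compared with the tensor transformation law. The array `K`, the rotation, and the helper names are hypothetical.

```python
import math

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def tensor_law(a, K):
    """K'_ij = sum_kl a_ik a_jl K_kl, the Cartesian form of Eq. (2.132)."""
    r = range(3)
    return [[sum(a[i][k] * a[j][l] * K[k][l] for k in r for l in r)
             for j in r] for i in r]

def k_from_relation(a, K):
    """Build K' column by column from B' = K' A' in the primed frame."""
    a_inv = transpose(a)                  # inverse of a rotation matrix
    columns = []
    for j in range(3):
        A_primed = [1.0 if m == j else 0.0 for m in range(3)]
        A = matvec(a_inv, A_primed)       # transform A back to the unprimed frame
        B = matvec(K, A)                  # the relation in the unprimed frame
        columns.append(matvec(a, B))      # B' = a B is column j of K'
    return transpose(columns)

c, s = math.cos(1.2), math.sin(1.2)       # arbitrary angle (assumption)
a = [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]   # rotation about the y-axis
K = [[1.0, 2.0, 0.0],                     # arbitrary sample array
     [-1.0, 0.5, 3.0],
     [4.0, 0.0, 2.0]]

from_relation = k_from_relation(a, K)
from_law = tensor_law(a, K)
max_error = max(abs(from_relation[i][j] - from_law[i][j])
                for i in range(3) for j in range(3))
print(max_error)
```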


$^{18}$ Note the order of the indices of the $a^l_j$ in this inverse transformation. We have

\[
A^l = \sum_j \frac{\partial x_l}{\partial x'_j}\, A'^j = \sum_j a^l_j A'^j.
\]

$^{19}$ We might, for instance, take $A'^1 = 1$ and $A'^m = 0$ for $m \neq 1$. Then the equation $K'^i_1 = a^i_k a^l_1 K^k_l$
follows immediately. The rest of Eq. (2.132) derives from other special choices of the arbitrary
$A'^j$.


EXAMPLE 2.8.1


Equations of Motion and Field Equations. In classical mechanics, Newton’s
equations of motion $m\dot{\mathbf{v}} = \mathbf{F}$ indicate, on the basis of the quotient rule,
that if the mass is a scalar and the force a vector, then the acceleration $\mathbf{a} = \dot{\mathbf{v}}$ is
a vector. In other words, the vector character of the force as the driving term
imposes its vector character on the acceleration provided the scale factor $m$
is scalar.

The wave equation of electrodynamics $\partial^2 A^\mu = J^\mu$ involves the Lorentz
scalar four-dimensional version of the Laplacian $\partial^2 = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2$ and the
external four-vector current $J^\mu$ as its driving term. From the quotient rule, we
infer that the vector potential $A^\mu$ is a four-vector as well. If the driving
current is a four-vector, the vector potential must be of rank one by the quotient
rule. ■


SUMMARY 


The quotient rule is a substitute for the illegal division of tensors. 


EXERCISES 

2.8.1 The double summation $K^{ij}A_iB_j$ is invariant for any two vectors $A_i$ and
$B_j$. Prove that $K^{ij}$ is a second-rank tensor.

2.8.2 The equation $K^{ij}A_{jk} = B^i_k$ holds for all orientations of the coordinate
system. If $A$ and $B$ are second-rank tensors, show that $K$ is also a
second-rank tensor.

2.8.3 The exponential in a plane wave is $\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$. We recognize
$x^\mu = (ct, x_1, x_2, x_3)$ as a prototype vector in Minkowski space. If $\mathbf{k}\cdot\mathbf{r} - \omega t$
is a scalar under Lorentz transformations (Chapter 4), show that
$k^\mu = (\omega/c, k_1, k_2, k_3)$ is a four-vector in Minkowski space.

Note. Multiplication by $\hbar$ yields $(E/c, \mathbf{p})$ as a four-vector momentum in
Minkowski space.


2.9 Dual Tensors 


Levi-Civita Symbol


For future use, it is convenient to introduce the three-dimensional Levi-Civita
symbol defined by

\[
\begin{aligned}
\varepsilon_{123} &= \varepsilon_{231} = \varepsilon_{312} = 1, \\
\varepsilon_{132} &= \varepsilon_{213} = \varepsilon_{321} = -1, \\
\text{all other } \varepsilon_{ijk} &= 0
\end{aligned} \tag{2.133}
\]

in three-dimensional Cartesian space (where we do not have to distinguish
between upper and lower indices). Note that $\varepsilon_{ijk}$ is antisymmetric with respect
to all pairs of indices. Suppose now that we have a third-rank tensor $\delta_{ijk}$, which
in one particular coordinate system is equal to $\varepsilon_{ijk}$. Then we can show that

\[
\delta'_{ijk} = \det(a)\, a_{ip}a_{jq}a_{kr}\,\varepsilon_{pqr}, \tag{2.134}
\]

where $\det(a)$ is the determinant (or volume; see Chapter 1) of the coordinate
rotation. Recall that this determinant factor already occurred in the
transformation law of the cross product in Section 2.6. A tensor transforming according
to Eq. (2.134) is called a pseudotensor to distinguish it from tensors whose
transformation laws do not contain $\det(a)$. For a rotation of the coordinates
$\det(a) = 1$, but if an axis is inverted or reflected, $x \to -x$, $\det(a) = -1$. Now,

\[
a_{1p}a_{2q}a_{3r}\,\varepsilon_{pqr} = \det(a) \tag{2.135}
\]

by direct expansion of the determinant, showing that $\delta'_{123} = \det(a)^2 = 1 = \varepsilon_{123}$.
Considering the other possibilities, we find

\[
\delta'_{ijk} = \varepsilon_{ijk} \tag{2.136}
\]

for rotations and reflections. Hence, $\varepsilon_{ijk}$ is a pseudotensor.$^{20,21}$ Furthermore, it
is an isotropic pseudotensor with the same components in all rotated Cartesian
coordinate systems.
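
Equations (2.134) to (2.136) can be verified by brute-force summation. The sketch below is a minimal illustration: indices run over 0, 1, 2 instead of 1, 2, 3, the closed-form expression used for `levi_civita` is one convenient choice among several, and the rotation angle and the reflected matrix are arbitrary test inputs.

```python
import math

def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2} (Eq. (2.133) shifted by one)."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(a):
    """det(a) = a_1p a_2q a_3r epsilon_pqr, Eq. (2.135)."""
    r = range(3)
    return sum(a[0][p] * a[1][q] * a[2][s] * levi_civita(p, q, s)
               for p in r for q in r for s in r)

def transformed_epsilon(a):
    """a_ip a_jq a_kr epsilon_pqr, i.e. Eq. (2.134) without the det(a) factor."""
    r = range(3)
    return [[[sum(a[i][p] * a[j][q] * a[k][s] * levi_civita(p, q, s)
                  for p in r for q in r for s in r)
              for k in r] for j in r] for i in r]

def max_deviation(a):
    """Largest deviation of det(a) * a_ip a_jq a_kr eps_pqr from eps_ijk."""
    d = det3(a)
    t = transformed_epsilon(a)
    r = range(3)
    return max(abs(d * t[i][j][k] - levi_civita(i, j, k))
               for i in r for j in r for k in r)

c, s = math.cos(0.8), math.sin(0.8)     # arbitrary angle (assumption)
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[-c, -s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]   # x' axis inverted

print(det3(rotation), max_deviation(rotation))
print(det3(reflection), max_deviation(reflection))
```

For the reflection the bare product $a_{ip}a_{jq}a_{kr}\varepsilon_{pqr}$ equals $-\varepsilon_{ijk}$; only with the factor $\det(a) = -1$ are the components restored, which is why $\varepsilon_{ijk}$ is a pseudotensor.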



With any antisymmetric second-rank tensor $C$ (in three-dimensional space)
we may associate a dual pseudovector $C_i$ defined by

\[
C_i = \tfrac{1}{2}\varepsilon_{ijk}C_{jk}. \tag{2.137}
\]

Here, the antisymmetric $C$ may be written as a $3 \times 3$ square array:

\[
C = \begin{pmatrix}
C_{11} & C_{12} & C_{13} \\
C_{21} & C_{22} & C_{23} \\
C_{31} & C_{32} & C_{33}
\end{pmatrix}
= \begin{pmatrix}
0 & C_{12} & -C_{31} \\
-C_{12} & 0 & C_{23} \\
C_{31} & -C_{23} & 0
\end{pmatrix}. \tag{2.138}
\]

We know that $C_i$ must transform as a vector under rotations from the double
contraction of the fifth-rank (pseudo) tensor $\varepsilon_{ijk}C_{mn}$, but it is really a
pseudovector because of the pseudo nature of $\varepsilon_{ijk}$. Specifically, the components of $\mathbf{C}$
are given by

\[
(C_1, C_2, C_3) = (C_{23}, C_{31}, C_{12}). \tag{2.139}
\]

Notice the cyclic order of the indices that derives from the cyclic order of the
components of $\varepsilon_{ijk}$. This duality, given by Eq. (2.139), means that our
three-dimensional vector product may be taken to be either a pseudovector or an
antisymmetric second-rank tensor, depending on how we choose to write it out.
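
As an illustration of Eqs. (2.137) and (2.139), the antisymmetric tensor $C_{jk} = A_jB_k - A_kB_j$ built from two vectors has the cross product $\mathbf{A}\times\mathbf{B}$ as its dual. The sketch below assumes this particular construction of $C_{jk}$ and arbitrary integer test vectors; indices run over 0, 1, 2.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def dual_vector(C):
    """C_i = (1/2) epsilon_ijk C_jk, Eq. (2.137)."""
    r = range(3)
    return [0.5 * sum(levi_civita(i, j, k) * C[j][k] for j in r for k in r)
            for i in r]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

A = [1, 2, 3]      # arbitrary test vectors (assumption)
B = [4, 5, 6]

# Antisymmetric second-rank tensor formed from the two vectors.
C = [[A[j] * B[k] - A[k] * B[j] for k in range(3)] for j in range(3)]

dual = dual_vector(C)
cyclic = [C[1][2], C[2][0], C[0][1]]    # (C_23, C_31, C_12) of Eq. (2.139)
print(dual, cyclic, cross(A, B))
```

All three lists contain the same components, which shows in a concrete case that the cross product and the antisymmetric tensor carry the same three independent numbers.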

$^{20}$ The usefulness of $\varepsilon_{ijk}$ extends far beyond this section. For instance, the matrices $M_k$ of
Exercise 3.2.11 are derived from $(M_k)_{ij} = -i\varepsilon_{ijk}$. Much of elementary vector analysis can be written in
a very compact form by using $\varepsilon_{ijk}$ and the identity of Exercise 2.9.4. See A. A. Evett, Permutation
symbol approach to elementary vector analysis. Am. J. Phys. 34, 503 (1966).

$^{21}$ The numerical value of $\varepsilon_{ijk}$ is given by the triple scalar product of coordinate unit vectors:

\[
\hat{\mathbf{q}}_i \cdot \hat{\mathbf{q}}_j \times \hat{\mathbf{q}}_k.
\]

From this standpoint, each element of $\varepsilon_{ijk}$ is a pseudoscalar, but the $\varepsilon_{ijk}$ collectively form a
third-rank pseudotensor.


If we take three (polar) vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, we may define the direct
product

\[
V_{ijk} = A_iB_jC_k. \tag{2.140}
\]

By an extension of the analysis of Section 2.6, $V_{ijk}$ is a tensor of third rank.
The dual quantity is

\[
V = \frac{1}{3!}\,\varepsilon_{ijk}V_{ijk}, \tag{2.141}
\]

clearly a pseudoscalar. By expansion, it is seen that the contraction itself,

\[
3!\,V = \varepsilon_{ijk}A_iB_jC_k =
\begin{vmatrix}
A_1 & B_1 & C_1 \\
A_2 & B_2 & C_2 \\
A_3 & B_3 & C_3
\end{vmatrix},
\]

is our familiar triple scalar product.
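
The expansion can be checked by summing over all index values. The sketch below uses arbitrary integer vectors and hypothetical helper names; it also shows the sign change under inversion of all three vectors that makes the result a pseudoscalar.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def epsilon_contraction(A, B, C):
    """epsilon_ijk A_i B_j C_k, summed over all three indices."""
    r = range(3)
    return sum(levi_civita(i, j, k) * A[i] * B[j] * C[k]
               for i in r for j in r for k in r)

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

A = [1, 2, 3]       # arbitrary test vectors (assumption)
B = [4, 5, 6]
C = [7, 8, 10]

contraction = epsilon_contraction(A, B, C)
triple = dot(A, cross(B, C))            # A . (B x C)

# Inverting all three polar vectors flips the sign: pseudoscalar behavior.
neg = lambda v: [-x for x in v]
inverted = epsilon_contraction(neg(A), neg(B), neg(C))
print(contraction, triple, inverted)
```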

For use in writing Maxwell’s equations in covariant form, we want to extend
this dual vector analysis to four-dimensional Minkowski space and, in
particular, to indicate that the four-dimensional volume element, $dx_0\,dx_1\,dx_2\,dx_3$, is
a pseudoscalar.

We introduce the Levi-Civita symbol $\varepsilon_{ijkl}$, the four-dimensional analog of
$\varepsilon_{ijk}$. This quantity $\varepsilon_{ijkl}$ is defined as totally antisymmetric in all four indices,
which run from 0 to 3. If $(ijkl)$ is an even permutation$^{22}$ of (0, 1, 2, 3), then
$\varepsilon_{ijkl}$ is defined as $+1$; if it is an odd permutation, then $\varepsilon_{ijkl}$ is $-1$; and it is
defined as 0 if any two indices are equal. The Levi-Civita $\varepsilon_{ijkl}$ may be proved
to be a pseudotensor of rank 4 by analysis similar to that used for establishing
the tensor nature of $\varepsilon_{ijk}$. Introducing the direct product of four vectors as a
fourth-rank tensor with components

\[
H^{ijkl} = A^iB^jC^kD^l, \tag{2.143}
\]

built from the four-vectors $A$, $B$, $C$, and $D$, we may define the dual quantity

\[
H = \frac{1}{4!}\,\varepsilon_{ijkl}H^{ijkl}. \tag{2.144}
\]

We actually have a quadruple contraction that reduces the rank to zero. From
the pseudo nature of $\varepsilon_{ijkl}$, $H$ is a pseudoscalar. Now we let $A$, $B$, $C$, and $D$ be
infinitesimal displacements along the four coordinate axes (Minkowski space),

\[
A = (dx_0, 0, 0, 0), \qquad B = (0, dx_1, 0, 0), \qquad \text{and so on,} \tag{2.145}
\]

and

\[
4!\,H = \varepsilon_{ijkl}A^iB^jC^kD^l = dx_0\,dx_1\,dx_2\,dx_3. \tag{2.146}
\]

The four-dimensional volume element is now identified as a pseudoscalar.
This result could have been expected from the results of the special theory of
relativity. The Lorentz-Fitzgerald contraction of $dx_1\,dx_2\,dx_3$ balances the time
dilation of $dx_0$.
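
The four-index contraction can be evaluated by summing over the 24 permutations of (0, 1, 2, 3), since all other index combinations give zero. The sketch below is illustrative: the displacement values, the boost velocity, and the helper names are assumptions. Applied to the rows of a Lorentz boost, the same contraction gives the determinant of the transformation, which equals 1 and reflects the balance between contraction and dilation mentioned above.

```python
import math
from itertools import permutations

def perm_sign(p):
    """+1 for an even permutation, -1 for an odd one (count of inversions)."""
    inversions = sum(1 for m in range(len(p)) for n in range(m + 1, len(p))
                     if p[m] > p[n])
    return -1 if inversions % 2 else 1

def epsilon_contract(A, B, C, D):
    """epsilon_ijkl A^i B^j C^k D^l; only permutations of (0,1,2,3) contribute."""
    return sum(perm_sign(p) * A[p[0]] * B[p[1]] * C[p[2]] * D[p[3]]
               for p in permutations(range(4)))

# Displacements along the four axes (sample values standing in for dx_0 ... dx_3).
A = [2, 0, 0, 0]
B = [0, 3, 0, 0]
C = [0, 0, 5, 0]
D = [0, 0, 0, 7]
volume = epsilon_contract(A, B, C, D)     # product of the four displacements

# Rows of a boost along x with beta = 0.6 (arbitrary choice).
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta * beta)
boost = [[gamma, -gamma * beta, 0.0, 0.0],
         [-gamma * beta, gamma, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
boost_det = epsilon_contract(*boost)      # determinant of the boost matrix

print(volume, boost_det)
```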

We slipped into this four-dimensional space as a simple mathematical 
extension of the three-dimensional space and, indeed, we could just as easily 


(2.145) 

(2.146) 


22 A permutation is odd if it involves an odd number of interchanges of adjacent indices, such as 
(0 12 3)—>- (0 2 1 3). Even permutations arise from an even number of transpositions of adjacent 
indices. (Actually the word “adjacent” is not necessary.) £0123 = 4-1. 
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SUMMARY 


have discussed 5-, 6-, or A'-dimensional space, where the Levi-Civita symbol 
is defined similarly. This is typical of the power of the component analysis. 
Physically, this four-dimensional space may be taken as Minkowski space, 

(x 0 , x\, x 2 , x 3 ~) = ( ct , X, y, z), (2.147) 

where t is time. This is the merger of space and time achieved in special 
relativity. The transformations of four-dimensional space are the Lorentz trans¬ 
formations of special relativity. We encounter these Lorentz transformations 
in Section 4.4. 


Contraction of the antisymmetric Levi-Civita tensor with another antisymmetric
tensor or the direct product of vectors leads to their dual tensors.
Determinants are pseudoscalar examples, and the cross product of two vectors is
their dual, another well-known example.
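
The summary can be made concrete with a short Python sketch. It is only an
illustration under stated assumptions: the function names `levi_civita` and
`cross_via_dual` are hypothetical (they do not come from the text), indices
run from 0 as in the four-dimensional case above, and the sign of a
permutation is found by counting inversions, which has the same parity as the
number of interchanges used in the definition.

```python
def levi_civita(*idx):
    """Levi-Civita symbol for any number of indices taken from 0..n-1.

    Returns 0 if an index repeats, otherwise +1 (even permutation)
    or -1 (odd permutation) of (0, 1, ..., n-1).
    """
    n = len(idx)
    if sorted(idx) != list(range(n)):
        return 0
    sign = 1
    # Every inversion (a larger index ahead of a smaller one) flips the sign.
    for i in range(n):
        for j in range(i + 1, n):
            if idx[i] > idx[j]:
                sign = -sign
    return sign


def cross_via_dual(a, b):
    """Cross product written as the contraction C_i = eps_ijk A_j B_k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]
```

For example, `levi_civita(0, 2, 1, 3)` reproduces the odd permutation quoted in
footnote 22, and `cross_via_dual` applied to the unit vectors along x and y
returns the unit vector along z.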


EXERCISES 


2.9.1 An antisymmetric square array is given by

$$\begin{pmatrix} 0 & C_3 & -C_2 \\ -C_3 & 0 & C_1 \\ C_2 & -C_1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & C^{12} & C^{13} \\ -C^{12} & 0 & C^{23} \\ -C^{13} & -C^{23} & 0 \end{pmatrix},$$

where $(C_1, C_2, C_3)$ form a pseudovector. Assuming that the relation

$$C_i = \frac{1}{2!}\,\varepsilon_{ijk} C^{jk}$$

holds in all coordinate systems, prove that $C^{jk}$ is a tensor. (This is
another form of the quotient theorem.)


2.9.2 Show that the vector product is unique to three-dimensional space; that
is, only in three dimensions can we establish a one-to-one correspondence
between the components of an antisymmetric tensor (second-rank) and the
components of a vector.


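2.9.3 Show that

(a) $\delta_{ii} = 3$, (b) $\delta_{ij}\varepsilon_{ijk} = 0$,
(c) $\varepsilon_{ipq}\varepsilon_{jpq} = 2\delta_{ij}$,
(d) $\varepsilon_{ijk}\varepsilon_{ijk} = 6$.

2.9.4 Show that

$$\varepsilon_{ijk}\varepsilon_{pqk} = \delta_{ip}\delta_{jq} - \delta_{iq}\delta_{jp}.$$

2.9.5 (a) Express the components of a cross product vector C, C = A x B,
in terms of $\varepsilon_{ijk}$ and the components of A and B.

(b) Use the antisymmetry of $\varepsilon_{ijk}$ to show that A . A x B = 0.

ANS. (a) $C_i = \varepsilon_{ijk} A_j B_k$.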

2.9.6 (a) Show that the inertia tensor (Section 3.5) may be written as

$$I_{ij} = m(x_n x_n \delta_{ij} - x_i x_j)$$

for a particle of mass m at $(x_1, x_2, x_3)$. Here, $x_n x_n = r^2$.

(b) Show that

$$I_{ij} = -M_{il}M_{lj} = -m\,\varepsilon_{ilk}x_k\,\varepsilon_{ljm}x_m,$$

where $M_{il} = m^{1/2}\varepsilon_{ilk}x_k$. This is the contraction of two
second-rank tensors and is identical with the matrix product of Section 3.2.


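2.9.7 Write $\nabla \cdot \nabla \times \mathbf{A}$ and
$\nabla \times \nabla\varphi$ in $\varepsilon_{ijk}$ notation so that it
becomes obvious that each expression vanishes.

ANS.

$$\nabla \cdot \nabla \times \mathbf{A} = \varepsilon_{ijk}\frac{\partial}{\partial x_i}\frac{\partial}{\partial x_j}A_k,$$

$$(\nabla \times \nabla\varphi)_i = \varepsilon_{ijk}\frac{\partial}{\partial x_j}\frac{\partial}{\partial x_k}\varphi.$$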


2.9.8 Expressing cross products in terms of Levi-Civita symbols
($\varepsilon_{ijk}$), derive the BAC-CAB rule [Eq. (1.52)].

Hint. The relation of Exercise 2.9.4 is helpful.


2.9.9 Verify that each of the following fourth-rank tensors is isotropic; that
is, it has the same form independent of any rotation of the coordinate
systems:

(a) $A_{ijkl} = \delta_{ij}\delta_{kl}$,

(b) $B_{ijkl} = \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}$,

(c) $C_{ijkl} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$.


2.9.10 Show that the two-index Levi-Civita symbol $\varepsilon_{ij}$ is a
second-rank pseudotensor (in two-dimensional space). Does this contradict the
uniqueness of $\delta_{ij}$ (Exercise 2.6.5)?

2.9.11 (1) Given $A_k = \frac{1}{2}\varepsilon_{ijk}B_{ij}$ with
$B_{ij} = -B_{ji}$, antisymmetric, show that

$$B_{mn} = \varepsilon_{mnk}A_k.$$


(2) Show that the vector identity 

(A x B) • (C x D) = (A ■ C)(B • D) - (A ■ D)(B • C) 

(Exercise 1.4.9) follows directly from the description of a cross
product with $\varepsilon_{ijk}$ and the identity of Exercise 2.9.4.


Additional Reading 


Jeffreys, H. (1952). Cartesian Tensors. Cambridge Univ. Press, Cambridge. 
This is an excellent discussion of Cartesian tensors and their application 
to a wide variety of fields of classical physics. 

Lawden, D. F. (1982). An Introduction to Tensor Calculus, Relativity and 
Cosmology, 3rd ed. Wiley, New York. 

Margenau, H., and Murphy, G. M. (1956). The Mathematics of Physics and
Chemistry, 2nd ed. Van Nostrand, Princeton, NJ. Chapter 5 covers
curvilinear coordinates and 13 specific coordinate systems.

Morse, P. M., and Feshbach, H. (1953). Methods of Theoretical Physics.
McGraw-Hill, New York. Chapter 5 includes a description of several
different coordinate systems. Note that Morse and Feshbach are not above
using left-handed coordinate systems even for Cartesian coordinates.
Elsewhere in this excellent (and difficult) book there are many examples of
the use of the various coordinate systems in solving physical problems.
Eleven additional fascinating but seldom encountered orthogonal coordinate
systems are discussed in the second edition of Mathematical Methods
for Physicists (1970).

Ohanian, H. C., and Ruffini, R. (1994). Gravitation and Spacetime, 2nd ed. 
Norton, New York. A well-written introduction to Riemannian geometry. 

Sokolnikoff, I. S. (1964). Tensor Analysis—Theory and Applications, 2nd ed. 
Wiley, New York. Particularly useful for its extension of tensor analysis to 
non-Euclidean geometries. 

Young, E. C. (1993). Vector and Tensor Analysis, 2nd ed. Dekker, New York. 




Determinants and 
Matrices 


3.1 Determinants 


We begin the study of matrices by solving linear equations, which will lead
us to determinants and matrices. The concept of determinant and the notation
were introduced by the renowned German mathematician, physicist, and
philosopher G. W. von Leibniz and developed further by the German
mathematician C. Jacobi. One of the major applications of determinants is in
the establishment of conditions for the existence of nontrivial solutions for
a set of linear equations. Matrices serve to write linear equations more
compactly; they also represent coordinate transformations such as rotations
(Section 3.3 on orthogonal matrices), with unitary matrices as their complex
analogs (Section 3.4). Eigenvector-eigenvalue problems of symmetric matrices
(Hermitian matrices for complex numbers) in Section 3.5 are of central
importance in geometry, physics, engineering, and astronomy.


Linear Equations: Examples 


Electrical circuits typically lead one to sets of linear equations. When a
battery supplies a DC voltage V across a resistance R, the current I flowing
is related to the voltage by Ohm's law V = RI. (At this stage, we do not
include a capacitance or an inductance with AC voltage because they would lead
us to complex numbers.) More complicated circuitry involves Kirchhoff's laws:
at a junction the sum of all incoming currents equals that of all outflowing
currents, and the sum of potential drops around a loop is equal to the input
voltage. Let us illustrate a case.




EXAMPLE 3.1.1 


[Figure 3.1: Battery-Powered Circuit]

Battery-Powered Circuit The circuit shown in Fig. 3.1 contains two loops:
one involving the voltages $V_1$ and $V_3$ reinforcing each other and one
involving $V_1$ and $V_2$ in opposite directions. Using the lower loop with
currents $I_1$, $I_3$ we set up the linear equations

$$R_1 I_1 + R_3 I_3 = V_1 + V_3, \quad I_1 = I_2 + I_3.$$

The upper loop with currents $I_1$, $I_2$ sets up the second set of linear
equations

$$R_1 I_1 + R_2 I_2 = V_1 - V_2, \quad I_3 = I_1 - I_2,$$

where the $I_j$ are unknown. For $R_1 = 4$ ohm, $R_2 = 3$ ohm, $R_3 = 2$ ohm
and $V_1 = 1$ V, $V_2 = 2$ V, $V_3 = 3$ V, we have to solve

$$4I_1 + 3I_2 = 1 - 2 = -1, \quad 4I_1 + 2I_3 = 1 + 3 = 4, \quad I_3 = I_1 - I_2$$

for $I_1$, $I_2$, $I_3$. Upon substituting $I_3 = I_1 - I_2$ we find the
linear equations

$$4I_1 + 3I_2 = -1, \quad 3I_1 - I_2 = 2.$$

Multiplying the second equation by 3 and adding it to the first eliminates
$I_2$ so that $I_1 = \frac{5}{13}$ A results. The second equation yields
$I_2 = -2 + 3I_1 = \frac{15 - 26}{13} = -\frac{11}{13}$ A and finally
$I_3 = I_1 - I_2 = \frac{5 + 11}{13} = \frac{16}{13}$ A. Given the large
resistance $R_1$ and small voltage $V_1$, it is reasonable that $I_1$ is the
smallest current and $I_3$ the biggest because the resistance $R_3$ is the
smallest and the driving voltage $V_3$ is the highest. ■
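
The substitution steps of this example can be sketched in a few lines of
Python. This is a minimal illustration, not part of the original text: the
function name `solve_two_loop_circuit` is hypothetical, exact rational
arithmetic from the standard `fractions` module is an assumed convenience, and
$R_2$ is assumed to be nonzero.

```python
from fractions import Fraction


def solve_two_loop_circuit(R1, R2, R3, V1, V2, V3):
    """Solve R1*I1 + R2*I2 = V1 - V2, R1*I1 + R3*I3 = V1 + V3, I3 = I1 - I2."""
    R1, R2, R3, V1, V2, V3 = map(Fraction, (R1, R2, R3, V1, V2, V3))
    # Substituting I3 = I1 - I2 turns the lower-loop equation into
    #   (R1 + R3)*I1 - R3*I2 = V1 + V3.
    # Adding R3 times the upper-loop equation to R2 times this one removes I2.
    I1 = (R3 * (V1 - V2) + R2 * (V1 + V3)) / (R1 * R3 + R2 * (R1 + R3))
    # Back-substitute into the upper-loop equation, then use the junction rule.
    I2 = ((V1 - V2) - R1 * I1) / R2
    I3 = I1 - I2
    return I1, I2, I3
```

Calling `solve_two_loop_circuit(4, 3, 2, 1, 2, 3)` returns the three currents
of the example as exact fractions, which makes the comparison with
5/13, -11/13, and 16/13 A immediate.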


Homogeneous Linear Equations 


We start with homogeneous equations, where no terms independent of the
unknown $x_i$ occur. Determinants give a condition when homogeneous
algebraic equations have nontrivial solutions. Suppose we have three unknowns
$x_1, x_2, x_3$:

$$a_1 x_1 + a_2 x_2 + a_3 x_3 = 0,$$
$$b_1 x_1 + b_2 x_2 + b_3 x_3 = 0,$$ (3.1)
$$c_1 x_1 + c_2 x_2 + c_3 x_3 = 0.$$

The problem is to determine under what conditions there is any solution,
apart from the trivial one $x_1 = 0$, $x_2 = 0$, $x_3 = 0$. If we use vector
notation $\mathbf{x} = (x_1, x_2, x_3)$ for the solution and three rows
$\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{b} = (b_1, b_2, b_3)$,
$\mathbf{c} = (c_1, c_2, c_3)$ of coefficients, then the three equations of
Eq. (3.1) become

$$\mathbf{a} \cdot \mathbf{x} = 0, \quad \mathbf{b} \cdot \mathbf{x} = 0, \quad \mathbf{c} \cdot \mathbf{x} = 0.$$ (3.2)

These three vector equations have the geometrical interpretation that x is
orthogonal to a, b, and c or lies in the intersection of the planes through
the origin that are orthogonal to a, b, and c, respectively, because each
homogeneous linear equation represents a plane through the origin. If the
volume spanned by a, b, c given by the triple scalar product [see Eq. (1.49)
of Section 1.4] or the determinant

$$D_3 = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = \det(\mathbf{a}, \mathbf{b}, \mathbf{c})
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$ (3.3)


is not zero, then there is only the trivial solution x = 0 because x cannot be
perpendicular to all a, b, c unless a, b, c are not all independent.
Conversely, if the previous triple scalar product or determinant of
coefficients vanishes, then one of the row vectors is a linear combination of
the other two. Let us assume that c lies in the plane spanned by a and b, that
is, the third equation of Eq. (3.2) is a linear combination of the first two
and not independent. Then x is orthogonal to a and b so that
$\mathbf{x} \sim \mathbf{a} \times \mathbf{b}$. To get rid of the unknown
proportionality constant we form ratios of the unknown $x_i$. Since
homogeneous equations can be multiplied by arbitrary numbers, only ratios of
the $x_i$ are relevant, for which we then obtain ratios of the components of
the cross product, or 2 x 2 determinants,

$$x_1/x_3 = (a_2 b_3 - a_3 b_2)/(a_1 b_2 - a_2 b_1),$$
$$x_2/x_3 = -(a_1 b_3 - a_3 b_1)/(a_1 b_2 - a_2 b_1)$$ (3.4)

from the components of the cross product a x b provided
$x_3 \sim a_1 b_2 - a_2 b_1 \neq 0$. If $x_3 = 0$ but $x_2 \neq 0$, we can
similarly find the ratio $x_1/x_2$. That is, $x_3$ does not play a special
role.
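
As a rough numerical check of this argument, the following Python sketch
tests whether the triple scalar product vanishes and, if so, returns the
cross product a x b as a nontrivial solution. The helper names `cross`, `dot`,
and `nontrivial_solution` are hypothetical, and the sketch assumes that a and
b are themselves linearly independent (otherwise a x b is the zero vector).

```python
def cross(a, b):
    """Components of the cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def nontrivial_solution(a, b, c):
    """Return x ~ a x b if (a x b) . c = 0; return None if only x = 0 solves."""
    if dot(cross(a, b), c) != 0:
        return None
    return cross(a, b)
```

With a = (1, 2, 3), b = (4, 5, 6), and c = a + b = (5, 7, 9), the determinant
vanishes and the returned vector is orthogonal to all three rows; replacing c
by (0, 0, 1) gives a nonzero determinant, so only the trivial solution
remains.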


Inhomogeneous Linear Equations 


The simplest case of two equations with two unknowns

$$a_1 x_1 + a_2 x_2 = a_3, \quad b_1 x_1 + b_2 x_2 = b_3$$ (3.5)

has the geometrical meaning of the solution as the intersection of two lines.
This problem can be reduced to the previous homogeneous case by imbedding
it in three-dimensional space with a solution vector
$\mathbf{x} = (x_1, x_2, -1)$ and row vectors $\mathbf{a} = (a_1, a_2, a_3)$,
$\mathbf{b} = (b_1, b_2, b_3)$. Note that -1 is the third component of x in
order to account for the inhomogeneous constants $a_3$, $b_3$ in Eq. (3.5).
As before, Eq. (3.5) in vector notation, $\mathbf{a} \cdot \mathbf{x} = 0$
and $\mathbf{b} \cdot \mathbf{x} = 0$, implies that
$\mathbf{x} \sim \mathbf{a} \times \mathbf{b}$ so that the analog of Eq. (3.4)
holds. For this to apply, though, the third component of a x b must not be
zero (i.e., $a_1 b_2 - a_2 b_1 \neq 0$) because the third component of x is
-1 and not zero. This yields the $x_i$ as


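$$x_1 = (a_3 b_2 - b_3 a_2)/(a_1 b_2 - a_2 b_1)
= \begin{vmatrix} a_3 & a_2 \\ b_3 & b_2 \end{vmatrix} \Big/ \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix},$$ (3.6a)

$$x_2 = (a_1 b_3 - a_3 b_1)/(a_1 b_2 - a_2 b_1)
= \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \Big/ \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}.$$ (3.6b)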


This is Cramer's rule for two inhomogeneous linear equations with two
unknowns $x_1$, $x_2$: The determinant $D_1$ ($D_2$) in the numerator of
$x_1$ ($x_2$) is obtained from the determinant of the coefficients
$D = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}$ in the denominator
by replacing the first (second) column vector by the vector
$\begin{pmatrix} a_3 \\ b_3 \end{pmatrix}$ of the inhomogeneous side of
Eq. (3.5). Cramer's rule generalizes to $x_i = D_i/D$ for n inhomogeneous
linear equations with n unknowns, n = 2, 3, ..., where $D_i$ and D are
the corresponding n x n determinants.
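
Equations (3.6a) and (3.6b) translate almost literally into code. The sketch
below is illustrative only: the names `det2` and `cramer_2x2` are
hypothetical, and integer coefficients are assumed so that the ratios can be
kept exact with the standard `fractions` module.

```python
from fractions import Fraction


def det2(p, q, r, s):
    """Second-order determinant | p q ; r s | = p*s - q*r."""
    return p * s - q * r


def cramer_2x2(a1, a2, a3, b1, b2, b3):
    """Solve a1*x1 + a2*x2 = a3, b1*x1 + b2*x2 = b3 (integer coefficients)."""
    D = det2(a1, a2, b1, b2)
    if D == 0:
        raise ValueError("determinant of the coefficients vanishes")
    D1 = det2(a3, a2, b3, b2)   # first column replaced by (a3, b3)
    D2 = det2(a1, a3, b1, b3)   # second column replaced by (a3, b3)
    return Fraction(D1, D), Fraction(D2, D)
```

Applied to the reduced circuit equations of Example 3.1.1,
`cramer_2x2(4, 3, -1, 3, -1, 2)`, this gives the same two currents as the
substitution method.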


Biographical Data 

Cramer, Gabriel. Cramer, a Swiss mathematician, was born in 1704 and
died in 1752. His main contributions are in algebra and geometry. Cramer's
rule was known to Leibniz already but was first published posthumously
by Maclaurin in his Treatise on Algebra in 1748. Cramer's form of the rule
appeared in his well-written book in French (1750) on algebraic curves,
which became a standard reference so that he is often given credit for the
rule.


EXAMPLE 3.1.2 


Battery-Powered Circuit In order to apply this solution to Example 3.1.1,
let us write the current equations as

$$R_1 I_1 + R_2 I_2 = V_1 - V_2, \quad R_1 I_1 + R_3 (I_1 - I_2) = (R_1 + R_3) I_1 - R_3 I_2 = V_1 + V_3,$$

so that $a_3 = V_1 - V_2$, $b_3 = V_1 + V_3$, and the determinants are

$$D = \begin{vmatrix} R_1 & R_2 \\ R_1 + R_3 & -R_3 \end{vmatrix}, \quad
D_1 = \begin{vmatrix} V_1 - V_2 & R_2 \\ V_1 + V_3 & -R_3 \end{vmatrix}, \quad
D_2 = \begin{vmatrix} R_1 & V_1 - V_2 \\ R_1 + R_3 & V_1 + V_3 \end{vmatrix}.$$

If we plot Fig. 3.1 in the simpler form of Fig. 3.2, then the last equation can
be read off the lower loop involving the current $I_1$.

[Figure 3.2: Battery-Powered Circuit]

According to Cramer's rule the general solution becomes

$$I_1 = \frac{D_1}{D} = \frac{(V_1 - V_2)(-R_3) - (V_1 + V_3) R_2}{R_1(-R_3) - R_2(R_1 + R_3)},$$

$$I_2 = \frac{D_2}{D} = -\frac{R_1(V_1 + V_3) - (V_1 - V_2)(R_1 + R_3)}{R_1 R_3 + R_2(R_1 + R_3)}.$$

Plugging in the numbers we get (in amps)

$$I_1 = \frac{2 - 4 \cdot 3}{-4 \cdot 2 - 3 \cdot 6} = \frac{5}{13}, \quad
I_2 = -\frac{4 \cdot 4 + 6}{26} = -\frac{11}{13},$$

confirming our previous results. ■

These solutions in terms of ratios of determinants can be generalized to n
dimensions. The n x n determinant is a number that is represented by a square
array

$$D_n = \begin{vmatrix} a_1 & a_2 & \cdots & a_n \\ b_1 & b_2 & \cdots & b_n \\ c_1 & c_2 & \cdots & c_n \\ \vdots & \vdots & & \vdots \end{vmatrix}$$ (3.7)

of numbers (or functions), the coefficients of n linear equations in our case.
The number n of columns (and of rows) in the array is sometimes called the
order of the determinant. The generalization of the expansion in Eq. (1.46) of
the triple scalar product (of row vectors of three linear equations) leads to
the following value of the determinant $D_n$ in n dimensions:

$$D_n = \sum_{i,j,k,\ldots=1}^{n} \varepsilon_{ijk\cdots}\, a_i b_j c_k \cdots,$$ (3.8)

where $\varepsilon_{ijk\cdots}$, the Levi-Civita symbol introduced in
Section 2.9, is +1 for even permutations (see footnote 1) $(ijk\cdots)$ of
$(123\cdots n)$, -1 for odd permutations, and zero if any index is repeated.

Specifically, for the third-order determinant $D_3$ of Eq. (3.3), Eq. (3.8)
leads to

$$D_3 = +a_1 b_2 c_3 - a_1 b_3 c_2 - a_2 b_1 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - a_3 b_2 c_1.$$ (3.9)

Several useful properties of the nth-order determinants follow from Eq.
(3.8). Again, to be specific, Eq. (3.9) for third-order determinants is used to
illustrate these properties.
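
Equation (3.8) can be evaluated directly for small n by summing over all
permutations of the column indices. The following Python sketch does this; the
names `parity` and `det_permutation_sum` are hypothetical, indices start at 0
rather than 1, and terms with a repeated index are simply never generated
because the Levi-Civita symbol would make them vanish.

```python
from itertools import permutations


def parity(perm):
    """+1 for an even permutation of (0, ..., n-1), -1 for an odd one."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def det_permutation_sum(rows):
    """Sum of eps_{ijk...} a_i b_j c_k ... over all n! index permutations."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        # Row 0 contributes a_i, row 1 contributes b_j, and so on.
        for row, col in enumerate(perm):
            term *= rows[row][col]
        total += term
    return total
```

For a 3 x 3 array the loop produces exactly the six signed products of
Eq. (3.9). The cost grows as n!, so this form is useful for checking small
cases rather than for numerical work.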


Laplacian Development by Minors 


Equation (3.9) may be written as follows: 


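$$D_3 = a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1)$$

$$= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}.$$ (3.10)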


The number of terms in the sum [Eq. (3.8)] is 24 for a fourth-order determinant
and n! for an nth-order determinant. Because of the appearance of the negative
signs in Eq. (3.9) (and possibly in the individual elements as well), there may 
be considerable cancellation. It is quite possible that a determinant of large 
elements will have a very small value. 


EXAMPLE 3.1.3


Numerical Evaluation of a Hilbert Determinant To illustrate this point,
let us calculate the 3 x 3 Hilbert determinant expanding it according to
Eq. (3.10) along the top row

$$\begin{vmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{vmatrix}
= \begin{vmatrix} 1/3 & 1/4 \\ 1/4 & 1/5 \end{vmatrix}
- \frac{1}{2} \begin{vmatrix} 1/2 & 1/4 \\ 1/3 & 1/5 \end{vmatrix}
+ \frac{1}{3} \begin{vmatrix} 1/2 & 1/3 \\ 1/3 & 1/4 \end{vmatrix}$$

$$= \frac{1}{15 \cdot 16} - \frac{1}{10 \cdot 12} + \frac{1}{3 \cdot 8 \cdot 9} = \frac{1}{2^4 \cdot 3^3 \cdot 5}.$$

Its value is rather small compared to any of its entries because of significant
cancellations that are typical of Hilbert determinants. ■

1 In a linear sequence abcd..., any single, simple transposition of adjacent
elements yields an odd permutation of the original sequence: abcd -> bacd. Two
such transpositions yield an even permutation. In general, an odd number of
such interchanges of adjacent elements results in an odd permutation; an even
number of such transpositions yields an even permutation.
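
The cancellation in this example is easy to see when the three terms of the
top-row expansion are kept as exact fractions. The sketch below is a small
illustration; the name `det3_first_row` is hypothetical and the use of the
standard `fractions` module is an assumed convenience, not something the text
prescribes.

```python
from fractions import Fraction


def det3_first_row(m):
    """Return the three terms of the top-row expansion and their sum."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    terms = (a1 * (b2 * c3 - b3 * c2),
             -a2 * (b1 * c3 - b3 * c1),
             a3 * (b1 * c2 - b2 * c1))
    return terms, sum(terms)


# 3 x 3 Hilbert array with elements 1/(i + j - 1), i, j = 1, 2, 3.
hilbert3 = [[Fraction(1, i + j - 1) for j in range(1, 4)] for i in range(1, 4)]
terms, value = det3_first_row(hilbert3)
```

Here `terms` holds 1/240, -1/120, and 1/216, each far larger in magnitude than
their sum `value` = 1/2160; in floating-point arithmetic this kind of
cancellation costs significant digits.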

In general, the nth-order determinant may be expanded as a linear combination
of the products of the elements of any row (or any column) and the
(n - 1)-order determinants formed by striking out the row and column of the
original determinant in which the element appears. This reduced array (2 x 2
in this example) is called a minor. If the element is in the ith row and jth
column, the sign associated with the product is $(-1)^{i+j}$. The minor with
this sign is called the cofactor. If $M_{ij}$ is used to designate the minor
formed by omitting the ith row and the jth column and $C_{ij}$ is the
corresponding cofactor, Eq. (3.10) becomes

$$D_3 = \sum_{j=1}^{3} (-1)^{1+j} a_j M_{1j} = \sum_{j=1}^{3} a_j C_{1j}.$$ (3.11)

In this case, expanding along the first row, we have i = 1 and the summation
over j, the columns.

This Laplace expansion may be used to advantage in the evaluation of 
high-order determinants in which many of the elements are zero. 
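
A recursive first-row expansion shows why zeros help: every vanishing element
removes an entire minor from the computation. The following Python sketch is
one way to write this down; the names `minor` and `det_laplace` are
hypothetical, the matrix is assumed to be a list of lists, and indices start
at 0, so the sign $(-1)^{1+j}$ of Eq. (3.11) becomes `(-1) ** j`.

```python
def minor(m, i, j):
    """Array left after striking out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det_laplace(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, element in enumerate(m[0]):
        if element == 0:
            continue  # a zero element contributes nothing; skip its minor
        total += (-1) ** j * element * det_laplace(minor(m, 0, j))
    return total
```

For a sparse array only a few minors are ever visited, whereas a dense n x n
array still requires on the order of n! products.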


EXAMPLE 3.1.4 


Expansion across the Top Row To find the value of the determinant

$$D = \begin{vmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{vmatrix},$$ (3.12)

we expand across the top row to obtain (upon striking the first line and second
column)

$$D = (-1)^{1+2} \cdot (1) \begin{vmatrix} -1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{vmatrix}.$$ (3.13)

Again, expanding across the top row, we get (upon striking the first line and
first column)

$$D = (-1) \cdot (-1)^{1+1} \cdot (-1) \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix}
= \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = 1.$$ (3.14)

It is straightforward to check that we obtain the same result expanding any
other row or column.

[Figure 3.3: Shifting the Top Parallel to the Bottom Leaves Volume Unchanged]


Antisymmetry 


The determinant changes sign if any two rows are interchanged or
if any two columns are interchanged. This follows from the even-odd
character of the Levi-Civita $\varepsilon$ in Eq. (3.8) (see footnote 2).

This property was used in Section 2.9 to develop a totally antisymmetric
linear combination (a third rank tensor from three vectors). It is also
frequently used in quantum mechanics in the construction of a many particle
wave function that, in accordance with the (Pauli) exclusion principle, will
be antisymmetric under the interchange of any two identical spin-1/2 particles
(electrons, protons, neutrons, quarks, etc.). To summarize:


• As a special case of antisymmetry, any determinant with two rows equal or
two columns equal equals zero.

• If each element in a row or each element in a column is zero, the determinant
is equal to zero (seen by expanding across this row or column).

• If each element in a row or each element in a column is multiplied by a
constant, the determinant is multiplied by that constant (scaling up the
volume with one of its sides).

• The value of a determinant is unchanged if a multiple of one row is added
(column by column) to another row or if a multiple of one column is added
(row by row) to another column (see footnote 3).

• As a special case, a determinant is equal to zero if any two rows are
proportional or any two columns are proportional.
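
The rules listed above can be spot-checked numerically on any third-order
determinant. The sketch below uses Eq. (3.9) directly; the function name
`det3`, the sample array, and the constants 5 and 7 are arbitrary illustrative
choices, not values taken from the text.

```python
def det3(m):
    """Third-order determinant written out as the six terms of Eq. (3.9)."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 - a1 * b3 * c2 - a2 * b1 * c3
            + a2 * b3 * c1 + a3 * b1 * c2 - a3 * b2 * c1)


m = [[3, 2, 1], [2, 3, 1], [1, 1, 4]]

# Interchange the first two rows: the sign flips.
swapped = [m[1], m[0], m[2]]
# Multiply the first row by 5: the determinant is multiplied by 5.
scaled = [[5 * x for x in m[0]], m[1], m[2]]
# Add 7 times the third row to the first row: the value is unchanged.
sheared = [[x + 7 * y for x, y in zip(m[0], m[2])], m[1], m[2]]
# Make the second row proportional to the first: the determinant vanishes.
proportional = [m[0], [2 * x for x in m[0]], m[2]]
```

Evaluating `det3` on each of these arrays and comparing with `det3(m)`
illustrates, in order, antisymmetry, scaling, invariance under adding a
multiple of one row to another, and the vanishing for proportional rows.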


EXAMPLE 3.1.5 


Rules on Determinants To illustrate the next to last point, which derives
from the linearity and additivity of determinants,
$D(\mathbf{a}_1 + k\mathbf{a}_1', \ldots) = D(\mathbf{a}_1, \ldots) + kD(\mathbf{a}_1', \ldots)$
in terms of their column (or row) vectors, we show that

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 + ka_2 & a_2 & a_3 \\ b_1 + kb_2 & b_2 & b_3 \\ c_1 + kc_2 & c_2 & c_3 \end{vmatrix}.$$ (3.15)

Using the Laplace development on the right-hand side, we obtain

$$\begin{vmatrix} a_1 + ka_2 & a_2 & a_3 \\ b_1 + kb_2 & b_2 & b_3 \\ c_1 + kc_2 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
+ k \begin{vmatrix} a_2 & a_2 & a_3 \\ b_2 & b_2 & b_3 \\ c_2 & c_2 & c_3 \end{vmatrix}.$$ (3.16)

Then, by the property of antisymmetry, the second determinant on the
right-hand side of Eq. (3.16) vanishes because two columns are identical,
verifying Eq. (3.15).

Using these rules, let us evaluate the following determinant by subtracting
twice the third column from the second and three times the third from the first
column in order to create zeros in the first line:

$$\begin{vmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{vmatrix}
= \begin{vmatrix} 3 - 3 \cdot 1 & 2 - 2 \cdot 1 & 1 \\ 2 - 3 \cdot 1 & 3 - 2 \cdot 1 & 1 \\ 1 - 3 \cdot 4 & 1 - 2 \cdot 4 & 4 \end{vmatrix}
= \begin{vmatrix} 0 & 0 & 1 \\ -1 & 1 & 1 \\ -11 & -7 & 4 \end{vmatrix}
= \begin{vmatrix} -1 & 1 \\ -11 & -7 \end{vmatrix} = 7 + 11 = 18.$$ ■

2 The sign reversal for the interchange of two adjacent rows (or columns) is an
odd permutation. The interchange of any two rows is also an odd permutation.

3 This derives from the geometric meaning of the determinant as the volume of
the parallelepiped $(\mathbf{a}_1 \times \mathbf{a}_2) \cdot \mathbf{a}_3$
spanned by its column vectors. Pulling it to the side along the direction of
$\mathbf{a}_1$ (i.e., replacing
$\mathbf{a}_3 \to \mathbf{a}_3 + k\mathbf{a}_1$), without changing its height,
leaves the volume unchanged, etc. This is illustrated in Fig. 3.3.




Some useful relations involving determinants or matrices appear in Exercises 
of Sections 3.2 and 3.4. 


EXAMPLE 3.1.6 


Solving Linear Equations Returning to the homogeneous Eq. (3.1) and
multiplying the determinant of the coefficients by $x_1$, then adding $x_2$
times the second column and $x_3$ times the third column to the first column,
we can directly establish the condition for the presence of a nontrivial
solution for Eq. (3.1):

$$x_1 \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 x_1 & a_2 & a_3 \\ b_1 x_1 & b_2 & b_3 \\ c_1 x_1 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} a_1 x_1 + a_2 x_2 + a_3 x_3 & a_2 & a_3 \\ b_1 x_1 + b_2 x_2 + b_3 x_3 & b_2 & b_3 \\ c_1 x_1 + c_2 x_2 + c_3 x_3 & c_2 & c_3 \end{vmatrix}
= \begin{vmatrix} 0 & a_2 & a_3 \\ 0 & b_2 & b_3 \\ 0 & c_2 & c_3 \end{vmatrix} = 0.$$ (3.17)

Similar reasoning applies for $x_2$ and $x_3$. Therefore, $x_1$, $x_2$, and
$x_3$ must all be zero unless the determinant of the coefficients vanishes.
Conversely [see text following Eq. (3.3)], we can show that if the determinant
of the coefficients vanishes, a nontrivial solution does indeed exist.

If our linear equations are inhomogeneous (that is, as in Eq. (3.5), if
the zeros on the right-hand side of Eq. (3.1) are replaced by $a_4$, $b_4$,
and $c_4$, respectively), then from Eq. (3.17) we obtain, instead,

$$x_1 = \begin{vmatrix} a_4 & a_2 & a_3 \\ b_4 & b_2 & b_3 \\ c_4 & c_2 & c_3 \end{vmatrix}
\Big/ \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},$$ (3.18)

which generalizes Eq. (3.6a) to n = 3 dimensions, etc. This is Cramer's rule
generalized to three dimensions, and it can be shown to hold in n dimensions
for any positive integer n. If the determinant of the coefficients vanishes,
the inhomogeneous set of equations has no solution, unless the numerators also
vanish. In this case solutions may exist, but they are not unique (see Exercise
3.1.3 for a specific example). ■


For numerical work, the determinant solution of Eq. (3.18) is exceedingly
unwieldy. The determinant may involve large numbers with alternate signs,
and in the subtraction of two large numbers the relative error may soar to a
point that makes the result worthless. Also, although the determinant method
is illustrated here with 3 equations and 3 unknowns, we might easily have
200 equations with 200 unknowns, which, involving up to 200! terms in each
determinant, pose a challenge even to high-speed electronic computers. There
must be a better way.

In fact, there are better ways. One of the best is a straightforward process
often called Gauss elimination. To illustrate this technique, consider the
following set of linear equations:


EXAMPLE 3.1.7 


Gauss Elimination Solve

$$3x + 2y + z = 11$$
$$2x + 3y + z = 13$$ (3.19)
$$x + y + 4z = 12.$$

The determinant of the inhomogeneous linear equations [Eq. (3.19)] is obtained
as 18 by subtracting twice the third column from the second and three times
the third from the first column (see Example 3.1.5):

$$\begin{vmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{vmatrix}
= \begin{vmatrix} 0 & 0 & 1 \\ -1 & 1 & 1 \\ -11 & -7 & 4 \end{vmatrix}
= \begin{vmatrix} -1 & 1 \\ -11 & -7 \end{vmatrix} = 7 + 11 = 18,$$

so that a solution exists.

For convenience and for the optimum numerical accuracy, the equations
are rearranged so that the largest coefficients run along the main diagonal
(upper left to lower right). This has already been done in Eq. (3.19).

In the Gauss technique, the first equation is used to eliminate the first
unknown, x, from the remaining equations. Then the (new) second equation is
used to eliminate y from the last equation. In general, we work down through
the set of equations; then, with one unknown determined, we work back up to
solve for each of the other unknowns in succession.

Dividing each row by its initial coefficient, we see that Eqs. (3.19) become

$$x + \frac{2}{3}y + \frac{1}{3}z = \frac{11}{3}$$
$$x + \frac{3}{2}y + \frac{1}{2}z = \frac{13}{2}$$ (3.20)
$$x + y + 4z = 12.$$

Now, subtracting the first equation, we eliminate x from the second and third:

$$x + \frac{2}{3}y + \frac{1}{3}z = \frac{11}{3}$$
$$\frac{5}{6}y + \frac{1}{6}z = \frac{17}{6}$$ (3.21)
$$\frac{1}{3}y + \frac{11}{3}z = \frac{25}{3}$$

and then normalize the y-coefficients to 1 again:

$$x + \frac{2}{3}y + \frac{1}{3}z = \frac{11}{3}$$
$$y + \frac{1}{5}z = \frac{17}{5}$$ (3.22)
$$y + 11z = 25.$$

Repeating the technique, we use the new second equation to eliminate y
from the third equation:

$$x + \frac{2}{3}y + \frac{1}{3}z = \frac{11}{3}$$
$$y + \frac{1}{5}z = \frac{17}{5}$$ (3.23)
$$54z = 108,$$

or

$$z = 2.$$

Finally, working back up, we get

$$y + \frac{1}{5} \times 2 = \frac{17}{5},$$

or

$$y = 3.$$

Then with z and y determined,

$$x + \frac{2}{3} \times 3 + \frac{1}{3} \times 2 = \frac{11}{3}$$

and

$$x = 1.$$

The technique may not seem as elegant as Eq. (3.18), but it is well adapted to
computers and is far faster than using determinants.
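
The elimination just carried out by hand can be sketched as a short Python
routine. This is an illustrative version under stated assumptions, not the
text's own program: the name `gauss_solve` is hypothetical, exact fractions
are used so that the intermediate values can be compared with Eqs. (3.20) to
(3.23), the rows are not renormalized to a leading 1 at each stage (the
division is deferred to the back substitution), and the largest available
pivot in each column is moved onto the diagonal.

```python
from fractions import Fraction


def gauss_solve(a, b):
    """Solve a x = b by forward elimination and back substitution."""
    n = len(a)
    # Augmented array: coefficients followed by the inhomogeneous column.
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the largest remaining coefficient of this unknown to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("determinant of the coefficients vanishes")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this unknown from all equations below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # Work back up: the last equation has one unknown, the next has two, ...
    x = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x
```

Calling `gauss_solve([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12])`
reproduces x = 1, y = 3, z = 2. The work grows roughly as n cubed rather than
n!, which is the practical advantage over determinant expansions.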

This Gauss technique is used to convert a determinant into triangular
form:

$$D = \begin{vmatrix} \alpha_1 & \beta_1 & \gamma_1 \\ 0 & \beta_2 & \gamma_2 \\ 0 & 0 & \gamma_3 \end{vmatrix}$$ (3.24)

for a third-order determinant. In this form $D = \alpha_1 \beta_2 \gamma_3$.
For an nth-order determinant the evaluation of the triangular form requires
only n - 1 multiplications compared with the n! terms required for the
general case.
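
A determinant can be evaluated along these lines by reducing the array to
triangular form and multiplying the diagonal elements. The sketch below is a
minimal version with exact fractions; the name `det_triangular` is
hypothetical, and the sign bookkeeping for row interchanges relies on the
antisymmetry property stated earlier in this section.

```python
from fractions import Fraction


def det_triangular(a):
    """Determinant from the product of the diagonal after elimination."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column below the diagonal: D = 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                # interchanging two rows flips the sign
        # Adding multiples of the pivot row to lower rows leaves D unchanged.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= m[i][i]
    return product
```

For the coefficients of Eq. (3.19) this returns 18, in agreement with the
value found by column operations, and for the array of Eq. (3.12) it
returns 1.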

A variation of this progressive elimination is known as Gauss-Jordan
elimination. We start as with the preceding Gauss elimination, but each new
equation considered is used to eliminate a variable from all the other
equations, not just those below it. If we had used this Gauss-Jordan
elimination, Eq. (3.23) would become

$$x + \frac{1}{5}z = \frac{7}{5}$$
$$y + \frac{1}{5}z = \frac{17}{5}$$ (3.25)
$$z = 2,$$

using the second equation of Eq. (3.22) to eliminate y from both the first and
third equations. Then the third equation of Eq. (3.23) is used to eliminate z
from the first and second, giving

$$x = 1$$
$$y = 3$$ (3.26)
$$z = 2.$$
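
The Gauss-Jordan variant differs from the Gauss sketch only in the inner
loop, which clears the pivot column in every other row, above as well as
below. The following Python illustration makes that explicit; the name
`gauss_jordan_solve` is hypothetical and exact fractions are again an assumed
convenience.

```python
from fractions import Fraction


def gauss_jordan_solve(a, b):
    """Solve a x = b by reducing the coefficient array to the unit matrix."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("determinant of the coefficients vanishes")
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row so that its leading coefficient is 1.
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        # Eliminate this unknown from every other equation, above and below.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # The last column now holds the solution directly, as in Eq. (3.26).
    return [row[n] for row in m]
```

No back substitution is needed here, at the price of somewhat more
arithmetic during the elimination.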


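Let us compare the Gauss method with Cramer's rule, where we have to
evaluate the determinants

$$D = \begin{vmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{vmatrix} = 18$$

from above, then $D_1$ with the inhomogeneous column vector replacing the
first column of D,

$$D_1 = \begin{vmatrix} 11 & 2 & 1 \\ 13 & 3 & 1 \\ 12 & 1 & 4 \end{vmatrix}
= \begin{vmatrix} 11 & 2 & 1 \\ 2 & 1 & 0 \\ 12 & 1 & 4 \end{vmatrix}
= \begin{vmatrix} 2 & 1 \\ 12 & 1 \end{vmatrix} + 4 \begin{vmatrix} 11 & 2 \\ 2 & 1 \end{vmatrix}
= (2 - 12) + 4(11 - 4) = 18,$$

where we subtract the first from the second row and then expand the
determinant along the last column. Hence, from Cramer's rule,
$x = D_1/D = 1$. Next, proceeding with $D_2$ the same way, we obtain

$$D_2 = \begin{vmatrix} 3 & 11 & 1 \\ 2 & 13 & 1 \\ 1 & 12 & 4 \end{vmatrix}
= \begin{vmatrix} 3 & 11 & 1 \\ -1 & 2 & 0 \\ 1 & 12 & 4 \end{vmatrix}
= \begin{vmatrix} -1 & 2 \\ 1 & 12 \end{vmatrix} + 4 \begin{vmatrix} 3 & 11 \\ -1 & 2 \end{vmatrix}
= (-12 - 2) + 4(6 + 11) = 54$$

so that $y = D_2/D = 54/18 = 3$. Finally,

$$D_3 = \begin{vmatrix} 3 & 2 & 11 \\ 2 & 3 & 13 \\ 1 & 1 & 12 \end{vmatrix}
= \begin{vmatrix} 3 & -1 & 11 \\ 2 & 1 & 13 \\ 1 & 0 & 12 \end{vmatrix}
= \begin{vmatrix} -1 & 11 \\ 1 & 13 \end{vmatrix} + 12 \begin{vmatrix} 3 & -1 \\ 2 & 1 \end{vmatrix}
= (-13 - 11) + 12(3 + 2) = 36,$$

where we subtract the first from the second column and expand $D_3$ along the
last line so that $z = D_3/D = 36/18 = 2$. ■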
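
For comparison with the elimination methods, Cramer's rule $x_i = D_i/D$ can
be written generically by replacing one column at a time. The sketch below is
illustrative only: the names `det` and `cramer` are hypothetical, integer
coefficients stored as lists of lists are assumed, and the determinant is
computed by a first-row expansion, which is adequate for small systems but
scales as n!.

```python
from fractions import Fraction


def det(m):
    """Determinant by first-row expansion (small arrays only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cramer(a, b):
    """Solve a x = b as x_i = D_i / D, for integer coefficients."""
    d = det(a)
    if d == 0:
        raise ValueError("determinant of the coefficients vanishes")
    solution = []
    for i in range(len(a)):
        # D_i: column i of the coefficients replaced by the inhomogeneous column.
        a_i = [row[:i] + [rhs] + row[i + 1:] for row, rhs in zip(a, b)]
        solution.append(Fraction(det(a_i), d))
    return solution
```

For Eq. (3.19) the three numerator determinants come out as 18, 54, and 36,
so `cramer` returns the same solution as the Gauss and Gauss-Jordan sketches,
but each unknown costs a full determinant evaluation.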


We return to the Gauss-Jordan technique in Section 3.2 on inverting matrices.
Another technique suitable for computer use is the Gauss-Seidel iteration
technique. Each technique has advantages and disadvantages. The Gauss
and Gauss-Jordan methods may have accuracy problems for large determinants.
This is also a problem for matrix inversion (Section 3.2). The Gauss-Seidel
method, as an iterative method, may have convergence problems. The
Gauss-Seidel iterative method and the Gauss and Gauss-Jordan elimination
methods are discussed in considerable detail by Ralston and Wilf and also
by Pennington (see footnote 4). Computer codes in FORTRAN and other
programming languages and extensive literature on the Gauss-Jordan
elimination and others are also given by Press et al. (see footnote 5).
Symbolic mathematical manipulation computer software, such as Mathematica,
Maple, Reduce, Matlab, and Mathcad, include matrix diagonalization and
inversion programs.


4 Ralston, A., and Wilf, H. (Eds.) (1960). Mathematical Methods for Digital
Computers. Wiley, New York; Pennington, R. H. (1970). Introductory Computer
Methods and Numerical Analysis. Macmillan, New York.

5 Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T.
(1992). Numerical Recipes, 2nd ed., Chapter 2. Cambridge Univ. Press,
Cambridge, UK.

SUMMARY


Linear equations are formally solved by ratios of determinants, that is, Cramer’s 
rule. However, a systematic elimination of variables leads to a solution much 
more efficiently, and computer codes are based on this Gauss method. 


Biographical Data 

Leibniz, Gottfried Wilhelm von. Leibniz, a German mathematician,
philosopher, and physicist, was born in Leipzig, Saxony (now Germany) in 1646
and died in Hannover in 1716. Son of a professor of philosophy, he was a child
prodigy whose universal talents remained at the genius level throughout his
life. He is thought by many to have been the last person knowledgeable in all
fields of human scholarship. He taught himself Latin at age 8, Greek at age
14, and obtained a law degree in 1665. He studied mathematics and physics
with the Dutch physicist C. Huygens. In 1667, he started a logical symbolism
that was a precursor of Boolean logic. In 1671, he devised a calculating
machine superior to that of the French mathematician Blaise Pascal. When
he visited the English physicist Boyle in London in 1673, he was promptly
elected a member of the Royal Society. In 1684, he published his version of
differential and integral calculus, which eventually drew him into a nasty
priority fight with Newton, each all but accusing the other of plagiarism
(which from Leibniz's superior notations and approach and Newton's famous
applications to gravity was practically ruled out). In 1693, he recognized
the conservation law of mechanical energy. Using variational techniques in
philosophy, he tried to prove that "ours is the best of all possible worlds,"
for which he was satirized by the French rationalist Voltaire. Determinants
are one of his minor achievements. In 1714, the Duke of Hannover became King
George I of England but left Leibniz there to die neglected and forgotten:
Like Newton, he had never married and had no family.


EXERCISES 

3.1.1 Find the currents of the circuits in Fig. 3.4.

3.1.2 Test the set of linear homogeneous equations

$$x + 3y + 3z = 0, \quad x - y + z = 0, \quad 2x + y + 3z = 0$$

to see if it possesses a nontrivial solution and find one.

3.1.3 Given the pair of equations

$$x + 2y = 3, \quad 2x + 4y = 6,$$

(a) show that the determinant of the coefficients vanishes;

(b) show that the numerator determinants [Eq. (3.6)] also vanish;

(c) find at least two solutions.

3.1.4 Express the components of A x B as 2 x 2 determinants. Then show
that the dot product A . (A x B) yields a Laplacian expansion of a 3 x 3
determinant. Finally, note that two rows of the 3 x 3 determinant are
identical and hence A . (A x B) = 0.

[Figure 3.4: Electric Circuits (a), (b), (c)]

3.1.5 If Cij is the cofactor of element a# [formed by striking out the ith row 
and jth column and including a sign (—1)' 1 '], show that 

(a) a ijCij = UjiCji = det(A), where det(A) is the determinant 
with the elements ay, 

(b) ; (iijC lk = djiCki — b? j ~f~ k- 

3.1.6 A determinant with all elements of order unity may be surprisingly small.
The Hilbert determinant $H_{ij} = (i + j - 1)^{-1}$, $i, j = 1, 2, \ldots, n$ is notorious
for its small values.

(a) Calculate the value of the Hilbert determinants of order n for
n = 1, 2, and 4. See Example 3.1.3 for the case n = 3.

(b) Find the Hilbert determinants of order n for n = 5 and 6.

ANS.   n    Det($H_n$)
       1    1
       2    $8.33333 \times 10^{-2}$
       3    $4.62963 \times 10^{-4}$
       4    $1.65344 \times 10^{-7}$
       5    $3.74930 \times 10^{-12}$
       6    $5.36730 \times 10^{-18}$
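The tabulated values can be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the original text; the helper names `hilbert` and `det` are chosen here for illustration, and the determinant is obtained by ordinary Gaussian elimination rather than by any method prescribed in the exercise.

```python
from fractions import Fraction


def hilbert(n):
    """Return the n x n Hilbert matrix with exact entries 1/(i + j - 1)."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def det(matrix):
    """Determinant by Gaussian elimination in exact Fraction arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row interchange flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


for n in range(1, 7):
    d = det(hilbert(n))
    print(n, d, f"{float(d):.5e}")
```

Working with `Fraction` avoids the rounding problems that make these determinants awkward in floating point; the decimal form is produced only when printing.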

3.1.7 Solve the following set of linear simultaneous equations. Give the results
to five decimal places or as rational numbers and check the solution using
symbolic software (if available):

\[
\begin{aligned}
1.0x_1 + 0.9x_2 + 0.8x_3 + 0.4x_4 + 0.1x_5 &= 1.0 \\
0.9x_1 + 1.0x_2 + 0.8x_3 + 0.5x_4 + 0.2x_5 + 0.1x_6 &= 0.9 \\
0.8x_1 + 0.8x_2 + 1.0x_3 + 0.7x_4 + 0.4x_5 + 0.2x_6 &= 0.8 \\
0.4x_1 + 0.5x_2 + 0.7x_3 + 1.0x_4 + 0.6x_5 + 0.3x_6 &= 0.7 \\
0.1x_1 + 0.2x_2 + 0.4x_3 + 0.6x_4 + 1.0x_5 + 0.5x_6 &= 0.6 \\
0.1x_2 + 0.2x_3 + 0.3x_4 + 0.5x_5 + 1.0x_6 &= 0.5.
\end{aligned}
\]


3.2 Matrices 


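We can write the linear Eqs. (3.1), (3.5), and (3.19) more elegantly and compactly
by collecting the coefficients in a 3 x 3 square array of numbers called
a matrix:

\[
A = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \tag{3.27}
\]

so that

\[
A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix} \tag{3.28}
\]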


for the inhomogeneous Eqs. (3.19). This concept will be useful if we define 
appropriate ways of dealing with these arrays of numbers. Note that a matrix 
does not have a numerical value like a determinant. The entries, or individual 
matrix elements, need not be numbers; they can also be functions. 


Basic Definitions, Equality, and Rank 


A matrix is defined as a square or rectangular array of numbers that obeys
certain laws. This is a perfectly logical extension of familiar mathematical concepts.
In arithmetic, we deal with single numbers. In the theory of complex
variables (Chapter 6), we deal with ordered pairs of numbers such as (1, 2) = 1 + 2i,
called complex numbers, in which the ordering is important. We now consider
numbers (or functions) ordered in a square or rectangular array. For convenience
in later work the entry numbers, or matrix elements, are distinguished
by two subscripts, the first indicating the row (horizontal) and the second
indicating the column (vertical) in which the number appears. For instance, $a_{13}$
is the matrix element in the first row, third column. Hence, if A is a matrix with
m rows and n columns, that is, an m x n array,

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}. \tag{3.29}
\]

Perhaps the most important fact is that the elements $a_{ij}$ are not combined with
one another. A matrix is not a determinant. It is an ordered array of numbers,
not a single number. For an n x n matrix, n is its order.

The matrix A, so far just an array of numbers, has the properties we assign to
it. Literally, this means constructing a new form of mathematics. We postulate
that matrices A, B, and C, with elements $a_{ij}$, $b_{ij}$, and $c_{ij}$, respectively, combine
according to the following rules.

Two matrices A = B are defined to be equal if and only if $a_{ij} = b_{ij}$ for
all values of i and j. This, of course, requires that A and B each be m by n (or
m x n) arrays (m rows, n columns).

Looking back at the homogeneous linear Eq. (3.1), we note that the matrix 
of coefficients, A, is made up of three row vectors that each represent one linear 
equation of the set. If their triple scalar product is not zero, they span a nonzero 
volume, are linearly independent, and the homogeneous linear equations have 
only the trivial solution. In this case, the matrix is said to have rank 3. In n 
dimensions the volume represented by the triple scalar product becomes the 
determinant, det(A), for a square matrix. If $\det(A) \neq 0$, the n x n matrix A has
rank n. The case of Eq. (3.1), where the vector c lies in the plane spanned
by a and b, corresponds to rank 2 of the matrix of coefficients because only
two of its row vectors (a, b corresponding to two equations) are independent.
In general, the rank m of a matrix is the maximal number of linearly
independent row vectors it has, with $0 \le m \le n$.


Matrix Multiplication, Inner Product 


We now write linear equations in terms of matrices because their coefficients
represent an array of numbers that we just defined as a matrix, and the column
vector of unknown $x_i$ we recognize as a 3 x 1 matrix consisting of three rows
and one column. In terms of matrices, the homogeneous linear equations take
the form

\[
\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
\equiv
\begin{pmatrix} a_1 x_1 + a_2 x_2 + a_3 x_3 \\ b_1 x_1 + b_2 x_2 + b_3 x_3 \\ c_1 x_1 + c_2 x_2 + c_3 x_3 \end{pmatrix} = 0, \tag{3.30}
\]

which is a 3 x 3 matrix A multiplied by a single column vector x, by forming
scalar products of the row vectors of the matrix with the column vector of the
unknown $x_i$. The inhomogeneous Eqs. (3.19) can be written similarly

\[
\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
\equiv
\begin{pmatrix} 3x_1 + 2x_2 + x_3 \\ 2x_1 + 3x_2 + x_3 \\ x_1 + x_2 + 4x_3 \end{pmatrix}
=
\begin{pmatrix} 11 \\ 13 \\ 12 \end{pmatrix}, \tag{3.31}
\]

and Eq. (3.5) can be formulated as

\[
\begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\equiv
\begin{pmatrix} a_1 x_1 + a_2 x_2 \\ b_1 x_1 + b_2 x_2 \end{pmatrix}
=
\begin{pmatrix} a_3 \\ b_3 \end{pmatrix}, \tag{3.32}
\]

where we label the array of coefficients a 2 x 2 matrix A consisting of two rows
and two columns and consider the vector x as a 2 x 1 matrix.

We take the summation of products in Eq. (3.30) as a definition of 
matrix multiplication involving the scalar product of each row vector 
of A with the column vector x. Thus, in matrix notation Eq. (3.30) becomes 


x' = Ax. 


(3.33) 


Matrix analysis belongs to linear algebra because matrices may be used as 
linear operators or maps such as rotations. Suppose we rotate the Cartesian 
coordinates of a two-dimensional space as in Section 2.6 so that, in vector 
notation, 



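\[
\begin{pmatrix} x_1' \\ x_2' \end{pmatrix}
=
\begin{pmatrix} x_1 \cos\varphi + x_2 \sin\varphi \\ -x_1 \sin\varphi + x_2 \cos\varphi \end{pmatrix}
=
\begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \tag{3.34}
\]

where we label the array of elements $a_{ij}$ a 2 x 2 matrix A consisting of two
rows and two columns and consider the vectors x, x' as 2 x 1 matrices.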
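As a small numerical illustration of Eq. (3.34), the sketch below builds the rotation matrix and applies it to a column vector by forming row-times-column scalar products. It is written in Python for convenience; the function names `rotation_matrix` and `apply` are illustrative assumptions, not notation from the text.

```python
import math


def rotation_matrix(phi):
    """2 x 2 matrix of Eq. (3.34) for a rotation of the coordinate axes by phi."""
    return [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]


def apply(matrix, vector):
    """Matrix times column vector: scalar product of each row with the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


x = [1.0, 0.0]                                # a point on the old x1 axis
x_prime = apply(rotation_matrix(math.pi / 2), x)
print(x_prime)                                # components in the rotated axes
```

For a quarter turn of the axes, the point on the old $x_1$ axis acquires the new components (0, -1) up to rounding, in agreement with the middle expression of Eq. (3.34).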

To extend this definition of multiplication of a matrix times a column vector
to the product of two 2 x 2 matrices, let the coordinate rotation be followed
by a second rotation given by matrix B such that

\[ \mathbf{x}'' = B\mathbf{x}'. \tag{3.35} \]

Therefore, symbolically

\[ \mathbf{x}'' = BA\mathbf{x}. \tag{3.36} \]

In component form,

\[
x_i'' = \sum_j b_{ij} x_j' = \sum_j b_{ij} \sum_k a_{jk} x_k = \sum_k \Bigl( \sum_j b_{ij} a_{jk} \Bigr) x_k. \tag{3.37}
\]













Thus, the summation over j is matrix multiplication defining a matrix C = BA
such that

\[ x_i'' = \sum_{k=1}^{2} c_{ik} x_k, \tag{3.38} \]

or x'' = Cx in matrix notation, where the elements of C are given by

\[ c_{ik} = \sum_j b_{ij} a_{jk}. \]

Again, this definition involves the scalar products of row vectors of B with
column vectors of A. This definition of matrix multiplication generalizes to
rectangular matrices, but a product AB makes sense only when the number of
columns of A equals the number of rows of B. This definition is useful, and
indeed this usefulness is the justification for its existence.

The geometrical interpretation is that the matrix product of the two matrices,
BA, is the rotation that carries the unprimed system directly into the
double-primed coordinate system. (Before discussing formal definitions, the
reader should note that a matrix A as an operator is described by its effect on
the coordinates or basis vectors. The matrix elements $a_{ij}$ constitute a
representation of the operator that depends on the choice of coordinates or a basis.)
From these examples we extract the general multiplication rule:

\[ AB = C \quad \text{if and only if} \quad c_{ij} = \sum_k a_{ik} b_{kj}. \tag{3.39} \]

The ij element of C is formed as a scalar product of the ith row of A with the jth
column of B [which demands that A have the same number of columns (n) as
B has rows]. The dummy index k takes on all values 1, 2, ..., n in succession,
that is,

\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + a_{i3} b_{3j} \tag{3.40} \]

for n = 3. Obviously, the dummy index k may be replaced by any other symbol
that is not already in use without altering Eq. (3.39). Perhaps the situation
may be clarified by stating that Eq. (3.39) defines the method of combining
certain matrices. This method of combination is called matrix multiplication
because, as emphasized earlier, Eq. (3.39) is useful.
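The multiplication rule of Eq. (3.39) translates almost literally into code. The Python sketch below is one straightforward way to write it (the name `matmul` and the sample matrices are chosen here for illustration); it also enforces the requirement that the left factor have as many columns as the right factor has rows.

```python
def matmul(a, b):
    """Return C = AB with c_ij = sum_k a_ik * b_kj, as in Eq. (3.39)."""
    n = len(b)  # number of rows of B
    if any(len(row) != n for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]           # 3 x 2
print(matmul(A, B))    # 2 x 2 result
print(matmul(B, A))    # 3 x 3 result: the order of the factors matters
```

Here AB is a 2 x 2 matrix while BA is 3 x 3, so for rectangular matrices the two products need not even have the same shape.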


EXAMPLE 3.2.1 


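Matrix Multiplication To illustrate, consider two (so-called Pauli) matrices

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\quad \text{and} \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

The 11 element of the product, $(\sigma_1 \sigma_3)_{11}$, is given by the sum of the products
of elements of the first row of $\sigma_1$ with the corresponding elements of the first
column of $\sigma_3$:

\[ 0 \cdot 1 + 1 \cdot 0 = 0; \]

that is, writing $\sigma_1$ in terms of two row vectors $\mathbf{a}_1, \mathbf{a}_2$ and $\sigma_3$ in terms of two
column vectors $\mathbf{b}_1, \mathbf{b}_2$, we have

\[
\sigma_1 \sigma_3
= \begin{pmatrix} \mathbf{a}_1 \cdot \mathbf{b}_1 & \mathbf{a}_1 \cdot \mathbf{b}_2 \\ \mathbf{a}_2 \cdot \mathbf{b}_1 & \mathbf{a}_2 \cdot \mathbf{b}_2 \end{pmatrix}
= \begin{pmatrix} 0 \cdot 1 + 1 \cdot 0 & 0 \cdot 0 + 1 \cdot (-1) \\ 1 \cdot 1 + 0 \cdot 0 & 1 \cdot 0 + 0 \cdot (-1) \end{pmatrix}
= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]

Here,

\[ (\sigma_1 \sigma_3)_{ij} = (\sigma_1)_{i1} (\sigma_3)_{1j} + (\sigma_1)_{i2} (\sigma_3)_{2j}. \]

Thus, direct application of the definition of matrix multiplication shows that

\[ \sigma_3 \sigma_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]

and therefore

\[ \sigma_3 \sigma_1 = -\sigma_1 \sigma_3. \]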


Dirac Bra-ket, Transposition 


The vector of unknown $x_i$ in linear equations represents a special case of a
matrix with one column and n rows. A shorthand notation, Dirac's ket$^6$ $|x\rangle$,
is common for such a column vector with components $x_i$, i = 1, 2, ..., n. If A
is an n x n matrix and $|x\rangle$ an n-component column vector, $A|x\rangle$ is defined as
in Eqs. (3.30) and (3.32). Similarly, if a matrix has one row and n columns, it
is called a row vector, $\langle x|$, with components $x_i$, i = 1, 2, ..., n. Clearly, $\langle x|$
and $|x\rangle$ are related by interchanging rows and columns, a matrix operation
called transposition, and for any matrix A, $\tilde{A}$ is called A transpose,$^7$ with
matrix elements $(\tilde{A})_{ik} = A_{ki}$. For example,

\[
\begin{pmatrix} 3 & 5 & 0 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}^{T}
=
\begin{pmatrix} 3 & 2 & 1 \\ 5 & 3 & 1 \\ 0 & 1 & 4 \end{pmatrix}.
\]

Transposing a product of matrices AB reverses the order and gives $\tilde{B}\tilde{A}$;
similarly, $A|x\rangle$ transpose is $\langle x|\tilde{A}$. The scalar product of vectors takes the form

\[ \langle x | y \rangle = \sum_i x_i y_i. \]

(For the determinant |A| of an orthogonal matrix A, we have $|A| = |\tilde{A}|$
because of the symmetric role that rows and columns play for determinants.)


Multiplication (by a Scalar) 


We can multiply each linear Eq. (3.30) by a number, $\alpha$, which means that
each coefficient is multiplied by $\alpha$. This suggests that we complement the
multiplication by the following rule: The multiplication of a matrix A by


6 Dirac's bra $\langle x|$ and ket $|x\rangle$ are abstract vectors. Only when we choose a basis of coordinates
$x_1, x_2, \ldots, x_n$, as we have here, are the components $x_i = \langle i | x \rangle$. In a complex space, $x_i^* = \langle x | i \rangle$.

7 Some texts denote A transpose by $A^T$.










the scalar quantity $\alpha$ is defined as

\[ \alpha A = (\alpha a_{ij}), \tag{3.41} \]

in which the elements of $\alpha A$ are $\alpha a_{ij}$; that is, each element of matrix A is
multiplied by the scalar factor. This rule is in striking contrast to the behavior
of determinants, in which the factor $\alpha$ multiplies only one column or one row
and not every element of the entire determinant. A consequence of this scalar
multiplication is that

\[ \alpha A = A \alpha, \quad \text{commutation.} \]

If A is a square matrix, we can evaluate its determinant det(A) and compare
$\alpha \det(A)$ with

\[ \det(\alpha A) = \alpha^n \det(A), \]

where we extract a factor $\alpha$ from each row (or column) of $\alpha A$.
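The contrast between scaling a matrix and scaling a determinant is easy to check numerically. The sketch below is a Python illustration under stated assumptions: the recursive cofactor routine `det` is a helper written for this example, and the matrix is the coefficient matrix of Eq. (3.28).

```python
def det(m):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: strike out row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


A = [[3, 2, 1],
     [2, 3, 1],
     [1, 1, 4]]
alpha = 2
scaled = [[alpha * e for e in row] for row in A]   # every element is multiplied

print(det(A))                       # determinant of A
print(det(scaled))                  # determinant of alpha * A
print(alpha ** len(A) * det(A))     # alpha^n det(A), n = 3
```

The last two printed values agree, while $\alpha \det(A)$ alone would be too small by a factor $\alpha^{n-1}$.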


Addition 


We also need to add and subtract matrices according to rules that, again, are 
suggested by and consistent with adding and subtracting linear equations. 
When we deal with two sets of linear equations in the same unknowns that we 
can add or subtract, then we add or subtract individual coefficients keeping 
the unknowns the same. This leads us to the definition 

\[ A + B = C \quad \text{if and only if} \quad a_{ij} + b_{ij} = c_{ij} \ \text{for all values of } i \text{ and } j, \tag{3.42} \]

with the elements combining according to the laws of ordinary algebra (or 
arithmetic if they are simple numbers). This means that 

A + B = B + A, commutation. (3.43) 


EXAMPLE 3.2.2 


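Matrix Addition To illustrate matrix addition, let us add the Pauli matrices
$\sigma_1$ and $\sigma_3$ of Example 3.2.1:

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
= \begin{pmatrix} 0 + 1 & 1 + 0 \\ 1 + 0 & 0 - 1 \end{pmatrix}
= \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \quad ■
\]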



Also, an associative addition law is satisfied: 

(A + B) + C = A + (B + C). (3.44) 

If all elements are zero, the matrix is called the null matrix and is denoted by
0. For all A,

\[ A + 0 = 0 + A = A, \tag{3.45} \]

with

\[
0 = \begin{pmatrix}
0 & 0 & 0 & \cdots \\
0 & 0 & 0 & \cdots \\
0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots &
\end{pmatrix}. \tag{3.46}
\]

Such m x n matrices form a linear space with respect to addition and
subtraction.

Except in special cases, matrix multiplication is not commutative:$^8$

\[ AB \neq BA. \]

However, from the definition of matrix multiplication$^9$ we can show that an
associative law

(AB)C = A(BC) 

holds. There is also a distributive law: 


A(B + C) = AB + AC. 


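The unit matrix 1 has elements $\delta_{ij}$, Kronecker delta, and the property that
1A = A1 = A for all A,

\[
1 = \begin{pmatrix}
1 & 0 & 0 & 0 & \cdots \\
0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & \cdots \\
0 & 0 & 0 & 1 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{pmatrix}. \tag{3.47}
\]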


It should be noted that it is possible for the product of two matrices to be the
null matrix without either one being the null matrix. For example, if

\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
\quad \text{and} \quad
B = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix},
\]

AB = 0. This differs from the multiplication of real or complex numbers, which
form a field, whereas the additive and multiplicative structure of matrices is
called a ring by mathematicians.


Product Theorem 


An important theorem links the determinant of a product of square matrices 
to the product of their determinants. Specifically, the product theorem states 
that the determinant of the product, det(AB), of two n x n matrices A and 
B is equal to the product of the determinants, det(A) det(B). 

8 Commutation or the lack of it is conveniently described by the commutator bracket symbol,
$[A, B] \equiv AB - BA$. $AB \neq BA$ becomes $[A, B] \neq 0$.

9 Note that the basic definitions of equality, addition, and multiplication are given in terms of
the matrix elements, the $a_{ij}$. All our matrix operations can be carried out in terms of the matrix
elements. However, we can also treat a matrix as an algebraic operator, as in Eq. (3.33). Matrix
elements and single operators each have their advantages, as will be seen in the following section.
We shall use both approaches.

To prove the product theorem, consider the n column vectors

\[ \mathbf{c}_k = \Bigl( \sum_j a_{ij} b_{jk}, \ i = 1, 2, \ldots, n \Bigr) \]

of the product matrix C = AB for k = 1, 2, ..., n. Each transpose column
vector can be displayed as a row vector

\[
\tilde{\mathbf{c}}_k = \Bigl( \sum_j a_{1j} b_{jk}, \ \sum_j a_{2j} b_{jk}, \ \ldots, \ \sum_j a_{nj} b_{jk} \Bigr)
= \sum_{j_k} b_{j_k k} \, \tilde{\mathbf{a}}_{j_k}.
\]

Thus, each column vector

\[ \mathbf{c}_k = \sum_{j_k} b_{j_k k} \, \mathbf{a}_{j_k} \]

is a sum of n column vectors $\mathbf{a}_{j_k} = (a_{i j_k}, \ i = 1, 2, \ldots, n)$. Note that we are now
using a different product summation index $j_k$ for each column vector $\mathbf{c}_k$. Since
any determinant

\[ D(b_1 \mathbf{a}_1 + b_2 \mathbf{a}_2) = b_1 D(\mathbf{a}_1) + b_2 D(\mathbf{a}_2) \]

is linear in its column vectors, we can pull out the summation sign in front
of the determinant from each column vector in C together with the common
column factor $b_{j_k k}$ so that

\[
\det(C) = \sum_{j_1, \ldots, j_n} b_{j_1 1} b_{j_2 2} \cdots b_{j_n n} \det(\mathbf{a}_{j_1}, \mathbf{a}_{j_2}, \ldots, \mathbf{a}_{j_n}). \tag{3.48}
\]

If we rearrange the column vectors $\mathbf{a}_{j_k}$ of the determinant factor in Eq. (3.48) in
the proper order 1, 2, ..., n, then we can pull the common factor
$\det(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n) = |A|$ in front of the n summation signs in Eq. (3.48). These column
permutations generate just the right sign $\varepsilon_{j_1 j_2 \cdots j_n}$ to produce in Eq. (3.48) the
expression in Eq. (3.8) for det(B) so that

\[
\det(AB) = \det(C) = \det(A) \sum_{j_1, \ldots, j_n} \varepsilon_{j_1 j_2 \cdots j_n} b_{j_1 1} b_{j_2 2} \cdots b_{j_n n} = \det(A) \det(B). \tag{3.49}
\]


EXAMPLE 3.2.3 


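Determinant of Matrix Product Consider

\[
A = \begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix},
\qquad
B = \begin{pmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix}.
\]

Then det(A) = 1/4 + 3/4 = 1 and det(B) = 1 + 3 = 4, whereas the matrix
product is

\[
AB = \begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix}
\begin{pmatrix} -1 & \sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix}
= \begin{pmatrix} 1/2 + 3/2 & -\sqrt{3}/2 + \sqrt{3}/2 \\ -\sqrt{3}/2 + \sqrt{3}/2 & 3/2 + 1/2 \end{pmatrix}
= 2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]

The determinant of the unit matrix is 1, of course. The product theorem gives

\[
\det(AB) = \det(A) \det(B) = 2^2 \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 4. \quad ■
\]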
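The numbers of Example 3.2.3 can be reproduced in a few lines. This Python sketch uses floating-point arithmetic, so the comparisons are made with a tolerance; the helper names `det2` and `matmul` are illustrative and not taken from the text.

```python
import math

s = math.sqrt(3)
A = [[-0.5, -s / 2],
     [s / 2, -0.5]]
B = [[-1.0, s],
     [-s, -1.0]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul(a, b):
    """Matrix product c_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


AB = matmul(A, B)
print(AB)                              # close to 2 times the unit matrix
print(det2(A), det2(B), det2(AB))      # det(AB) should equal det(A) * det(B)
```

Up to rounding, the product is twice the unit matrix and its determinant equals the product of the two individual determinants.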


Direct Product 


A second procedure for multiplying matrices is known as the direct, tensor, or
Kronecker product. If A is an m x m matrix and B an n x n matrix, then the
direct product is

\[ A \otimes B = C. \tag{3.50} \]

C is an mn x mn matrix with elements

\[ C_{\alpha\beta} = A_{ij} B_{kl}, \tag{3.51} \]

with

\[ \alpha = n(i - 1) + k, \qquad \beta = n(j - 1) + l. \]


EXAMPLE 3.2.4 


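A Direct Product If
$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ and
$B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$ are both 2 x 2 matrices,

\[
A \otimes B
= \begin{pmatrix} a_{11} B & a_{12} B \\ a_{21} B & a_{22} B \end{pmatrix}
= \begin{pmatrix}
a_{11} b_{11} & a_{11} b_{12} & a_{12} b_{11} & a_{12} b_{12} \\
a_{11} b_{21} & a_{11} b_{22} & a_{12} b_{21} & a_{12} b_{22} \\
a_{21} b_{11} & a_{21} b_{12} & a_{22} b_{11} & a_{22} b_{12} \\
a_{21} b_{21} & a_{21} b_{22} & a_{22} b_{21} & a_{22} b_{22}
\end{pmatrix}. \tag{3.52}
\]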
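The index bookkeeping of Eq. (3.51) can be made concrete with a short program. The following Python sketch (the function name `direct_product` and the sample matrices are assumptions made for illustration) fills the product matrix using the 1-based indices $\alpha$ and $\beta$ exactly as defined above, converting to 0-based list positions only when storing.

```python
def direct_product(a, b):
    """Direct (Kronecker) product C = A (x) B of an m x m and an n x n matrix."""
    m, n = len(a), len(b)
    c = [[0] * (m * n) for _ in range(m * n)]
    for i in range(1, m + 1):
        for j in range(1, m + 1):
            for k in range(1, n + 1):
                for l in range(1, n + 1):
                    alpha = n * (i - 1) + k      # row index of C, Eq. (3.51)
                    beta = n * (j - 1) + l       # column index of C
                    c[alpha - 1][beta - 1] = a[i - 1][j - 1] * b[k - 1][l - 1]
    return c


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
for row in direct_product(A, B):
    print(row)
```

Each 2 x 2 block of the printed result is an element of A times the whole of B, matching the block form in Eq. (3.52). Swapping the arguments gives a different matrix, which illustrates that the direct product is not commutative.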


The direct product is associative but not commutative. Examples appear 
in the construction of groups (see Chapter 4) and in vector or Hilbert space in 
quantum theory. 

Direct Sum of Vector Spaces If the vector space $V_1$ is made up of all linear
combinations of the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m$ and $V_2$ of $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n$, then their
direct sum $V_1 + V_2$ consists of all linear combinations of the $\mathbf{a}_i$ and $\mathbf{b}_j$.


Diagonal Matrices

An important special type of matrix is the square matrix in which all the nondiagonal
elements are zero; it is called diagonal. Specifically, if a 3 x 3 matrix
A is diagonal,

\[
A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}.
\]

Obvious examples are the unit and zero matrices. A physical interpretation of
such diagonal matrices and the method of reducing matrices to this diagonal
form are considered in Section 3.5. Here, we simply note a significant property
of diagonal matrices: Multiplication of diagonal matrices is commutative,

AB = BA, if A and B are each diagonal.


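Multiplication by a diagonal matrix $[d_1, d_2, \ldots, d_n]$ that has only nonzero
elements in the diagonal is particularly simple,

\[
\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
= \begin{pmatrix} 1 & 2 \\ 2 \cdot 3 & 2 \cdot 4 \end{pmatrix},
\]

whereas the opposite order gives

\[
\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 2 \cdot 2 \\ 3 & 2 \cdot 4 \end{pmatrix}.
\]

Thus, a diagonal matrix does not commute with another matrix unless
both are diagonal, or the diagonal matrix is proportional to the unit
matrix. This is borne out by the more general form

\[
[d_1, d_2, \ldots, d_n] A =
\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
= \begin{pmatrix} d_1 a_{11} & d_1 a_{12} & \cdots & d_1 a_{1n} \\ d_2 a_{21} & d_2 a_{22} & \cdots & d_2 a_{2n} \\ \vdots & & & \vdots \\ d_n a_{n1} & d_n a_{n2} & \cdots & d_n a_{nn} \end{pmatrix},
\]

whereas

\[
A [d_1, d_2, \ldots, d_n] =
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}
= \begin{pmatrix} d_1 a_{11} & d_2 a_{12} & \cdots & d_n a_{1n} \\ d_1 a_{21} & d_2 a_{22} & \cdots & d_n a_{2n} \\ \vdots & & & \vdots \\ d_1 a_{n1} & d_2 a_{n2} & \cdots & d_n a_{nn} \end{pmatrix}.
\]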

In the special case of multiplying two diagonal matrices, we simply multiply
the corresponding diagonal matrix elements, which obviously is commutative.

In any square matrix the sum of the diagonal elements is called the trace.
For example,

\[ \operatorname{trace} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = 1 + 4 = 5. \]

Clearly, the trace is a linear operation:

\[ \operatorname{trace}(A - B) = \operatorname{trace}(A) - \operatorname{trace}(B). \]

One of its interesting and useful properties is that the trace of a product of two
matrices A and B is independent of the order of multiplication:

\[
\operatorname{trace}(AB) = \sum_i (AB)_{ii} = \sum_i \sum_j a_{ij} b_{ji}
= \sum_j \sum_i b_{ji} a_{ij} = \sum_j (BA)_{jj}
= \operatorname{trace}(BA). \tag{3.53}
\]

This holds even though $AB \neq BA$. [Equation (3.53) means that the trace of any
commutator, [A, B] = AB - BA, is zero.] From Eq. (3.53) we obtain

\[ \operatorname{trace}(ABC) = \operatorname{trace}(BCA) = \operatorname{trace}(CAB), \]

which shows that the trace is invariant under cyclic permutation of the
matrices in a product.
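A quick numerical check of Eq. (3.53) and of the cyclic property may be helpful. The Python sketch below uses three arbitrarily chosen 2 x 2 integer matrices (they are illustrative and do not come from the text) together with small helpers `matmul` and `trace`.

```python
def matmul(a, b):
    """Matrix product c_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]

print(matmul(A, B), matmul(B, A))                 # the products differ
print(trace(matmul(A, B)), trace(matmul(B, A)))   # the traces agree

abc = trace(matmul(matmul(A, B), C))
bca = trace(matmul(matmul(B, C), A))
cab = trace(matmul(matmul(C, A), B))
print(abc, bca, cab)                              # cyclic permutations agree
```

Note that only cyclic reorderings are covered by the argument in the text; the sketch does not test noncyclic ones such as trace(ACB).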


Matrix Inversion 


We cannot define division of matrices, but for a square matrix A with nonzero
determinant we can find a matrix $A^{-1}$, called the inverse of A, so that$^{10}$

\[ A^{-1} A = A A^{-1} = 1. \tag{3.54} \]

This concept has immediate applications: If the matrix A represents a linear
transformation, such as a rotation of the coordinate axes, we would expect the
inverse transformation that restores the original coordinate axes to be given by
$A^{-1}$. If the matrix A describes a linear coordinate transformation, the inverse
matrix $A^{-1}$ allows us to formally write

\[ |x\rangle = A^{-1} |x'\rangle. \tag{3.55} \]

That is, $A^{-1}$ describes the reverse of the transformation given by A and returns
the coordinate system to its original position.


10 Here and throughout this chapter, our matrices have finite rank. If A is an infinite rank matrix
(n x n with $n \to \infty$), then life is more difficult. For $A^{-1}$ to be the inverse we must demand that
both

\[ A A^{-1} = 1 \quad \text{and} \quad A^{-1} A = 1. \]

One relation no longer implies the other.

If A is the matrix of coefficients of a system of inhomogeneous linear equations
$A|x\rangle = |b\rangle$, then the inverse matrix allows us to write the formal solution
as $|x\rangle = A^{-1}|b\rangle$. These examples indicate that the concept of inverse matrix is
useful.


EXAMPLE 3.2.5


Inverse 2 x 2 Matrix The general 2 x 2 matrix

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]

has the inverse

\[ A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]

if and only if $\det(A) = a_{11} a_{22} - a_{12} a_{21} \neq 0$. To verify this, we check that

\[
\frac{1}{\det(A)} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}
= \frac{1}{\det(A)} \begin{pmatrix} a_{11} a_{22} - a_{12} a_{21} & -a_{11} a_{12} + a_{12} a_{11} \\ a_{21} a_{22} - a_{22} a_{21} & -a_{21} a_{12} + a_{22} a_{11} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \quad ■
\]
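The 2 x 2 formula of Example 3.2.5 is short enough to code directly. The Python sketch below is an illustration under the assumption of integer (or rational) matrix elements, so that `fractions.Fraction` gives exact results; the function name `inverse_2x2` is chosen here and is not from the text.

```python
from fractions import Fraction


def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix from the closed formula of Example 3.2.5."""
    (a11, a12), (a21, a22) = a
    d = a11 * a22 - a12 * a21          # det(A)
    if d == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[Fraction(a22, d), Fraction(-a12, d)],
            [Fraction(-a21, d), Fraction(a11, d)]]


A = [[1, 2],
     [3, 4]]
A_inv = inverse_2x2(A)
print(A_inv)

# Check A * A_inv element by element; the result should be the unit matrix.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product)
```

A singular matrix such as one with proportional rows makes the function raise an error, mirroring the condition $\det(A) \neq 0$.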



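Inverse of Diagonal Matrix The inverse of a diagonal matrix is also
diagonal with inverse matrix elements along the diagonal:

\[
[d_1, d_2, \ldots, d_n]^{-1} = \left[ \frac{1}{d_1}, \frac{1}{d_2}, \ldots, \frac{1}{d_n} \right]
\]

because

\[
\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}
\begin{pmatrix} \frac{1}{d_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{d_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{d_n} \end{pmatrix}
= \begin{pmatrix} \frac{d_1}{d_1} & 0 & \cdots & 0 \\ 0 & \frac{d_2}{d_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{d_n}{d_n} \end{pmatrix}
= \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}.
\]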


Inverse of Product AB If A is an n x n matrix with determinant $\det(A) \neq 0$,
then it has a unique inverse $A^{-1}$ so that $A A^{-1} = A^{-1} A = 1$. If B is also an n x n
matrix with inverse $B^{-1}$, then the product AB has the inverse

\[ (AB)^{-1} = B^{-1} A^{-1} \tag{3.56} \]

because $A B B^{-1} A^{-1} = 1 = B^{-1} A^{-1} A B$.


Derivative of a Matrix and Its Inverse When the matrix elements $a_{jk}(t)$
of A are functions of the variable t, we define the derivative of A as
$\frac{dA}{dt} = \left( \frac{da_{jk}(t)}{dt} \right)$. To define the derivative of the inverse matrix, we differentiate
$A A^{-1} = 1$, obtaining

\[ 0 = \frac{d(A A^{-1})}{dt} = \frac{dA}{dt} A^{-1} + A \frac{dA^{-1}}{dt}. \]

Multiplying this relation by $A^{-1}$ on the left yields the formula

\[ \frac{dA^{-1}}{dt} = -A^{-1} \frac{dA}{dt} A^{-1}. \]

The generalization of the inverse of a 2 x 2 matrix in Example 3.2.5 is given
in Exercise 3.2.24, with the inverse matrix having matrix elements $(A^{-1})_{ij}$,

\[ (A^{-1})_{ij} = \frac{C_{ji}}{|A|}, \tag{3.57} \]

with the assumption that the determinant of A, $|A| \neq 0$. If it is zero, we label
A singular. No inverse exists. However, as explained at the end of Section 3.1,
this determinant form is totally unsuited for numerical work with large
matrices.

In Example 3.2.5, $C_{11} = a_{22}$, $C_{12} = -a_{21}$, $C_{21} = -a_{12}$, $C_{22} = a_{11}$ so that
$(A^{-1})_{12} = C_{21}/|A| = -a_{12}/|A|$, etc., as noted in the example.

There is a wide variety of alternative techniques. One of the best and
most commonly used is the Gauss-Jordan matrix inversion technique (demonstrated
in Example 3.2.6), which shows how to find matrices $M_L$ such that the
product $M_L A$ will be A but with

• one row multiplied by a constant, or

• one row replaced by the original row minus a multiple of another row, or

• rows interchanged.

Other matrices $M_R$ operating on the right ($A M_R$) can carry out the same
operations on the columns of A.

This means that the matrix rows and columns may be altered (by matrix
multiplication) as though we were dealing with determinants, so we can apply
the Gauss-Jordan elimination techniques of Section 3.1 to the matrix elements.
Hence, there exists a matrix $M_L$ (or $M_R$) such that$^{11}$

\[ M_L A = 1. \tag{3.58} \]

Then $M_L = A^{-1}$. We determine $M_L$ by carrying out the identical elimination
operations on the unit matrix. Then

\[ M_L 1 = M_L. \tag{3.59} \]


To illustrate this, we consider a specific example. 



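11 Remember that $\det(A) \neq 0$.


EXAMPLE 3.2.6


Gauss-Jordan Matrix Inversion We want to invert the matrix

\[ A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix} \tag{3.60} \]

corresponding to the simultaneous linear equations of Example 3.1.7. For convenience
we write A and 1 side by side and carry out the identical operations
on each:

\[
\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.61}
\]

To get $a_{k1} = 1$, we multiply the first row by 1/3 and the second row by 1/2:

\[
\begin{pmatrix} 1 & \frac{2}{3} & \frac{1}{3} \\ 1 & \frac{3}{2} & \frac{1}{2} \\ 1 & 1 & 4 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} \frac{1}{3} & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.62}
\]

Subtracting the first row from the second and third, we obtain

\[
\begin{pmatrix} 1 & \frac{2}{3} & \frac{1}{3} \\ 0 & \frac{5}{6} & \frac{1}{6} \\ 0 & \frac{1}{3} & \frac{11}{3} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} \frac{1}{3} & 0 & 0 \\ -\frac{1}{3} & \frac{1}{2} & 0 \\ -\frac{1}{3} & 0 & 1 \end{pmatrix}. \tag{3.63}
\]

Then we divide the second row (of both matrices) by 5/6 and subtract 2/3 times
it from the first row and 1/3 times it from the third row. The results for both
matrices are

\[
\begin{pmatrix} 1 & 0 & \frac{1}{5} \\ 0 & 1 & \frac{1}{5} \\ 0 & 0 & \frac{18}{5} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} \frac{3}{5} & -\frac{2}{5} & 0 \\ -\frac{2}{5} & \frac{3}{5} & 0 \\ -\frac{1}{5} & -\frac{1}{5} & 1 \end{pmatrix}. \tag{3.64}
\]

We divide the third row (of both matrices) by 18/5. Then, as the last step, 1/5 times
the third row is subtracted from each of the first two rows (of both matrices).
Our final pair is

\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad \text{and} \quad
A^{-1} = \begin{pmatrix} \frac{11}{18} & -\frac{7}{18} & -\frac{1}{18} \\ -\frac{7}{18} & \frac{11}{18} & -\frac{1}{18} \\ -\frac{1}{18} & -\frac{1}{18} & \frac{5}{18} \end{pmatrix}. \tag{3.65}
\]

The check is to multiply the original A by the calculated $A^{-1}$ to see if we
really do get the unit matrix 1. Instead, we check that the column vector of
solution $x_i$,

\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= A^{-1} \begin{pmatrix} 11 \\ 13 \\ 12 \end{pmatrix}
= \begin{pmatrix} \frac{11 \cdot 11 - 7 \cdot 13 - 12}{18} \\ \frac{-7 \cdot 11 + 11 \cdot 13 - 12}{18} \\ \frac{-11 - 13 + 5 \cdot 12}{18} \end{pmatrix}
= \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix},
\]

coincides with that of Example 3.1.7. ■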
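The procedure of Example 3.2.6 can be sketched as a short program that carries A and the unit matrix along together. The Python version below is a minimal illustration with exact fractions; the function name `gauss_jordan_inverse` is an assumption, and the order of row operations differs slightly from the hand calculation (each pivot row is normalized in turn rather than scaling all rows first), although the final inverse is the same.

```python
from fractions import Fraction


def gauss_jordan_inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [A | 1]."""
    n = len(matrix)
    # Augment each row of A with the corresponding row of the unit matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Interchange rows if needed to obtain a nonzero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no inverse exists")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Multiply the pivot row by a constant so the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Subtract multiples of the pivot row from all other rows.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


A = [[3, 2, 1],
     [2, 3, 1],
     [1, 1, 4]]
A_inv = gauss_jordan_inverse(A)
b = [11, 13, 12]
x = [sum(e * bi for e, bi in zip(row, b)) for row in A_inv]
print(A_inv)
print(x)
```

Applying the computed inverse to the right-hand side (11, 13, 12) reproduces the solution vector found in the example.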
As with the Gauss-Jordan solution of simultaneous linear algebraic equations,
this technique is well adapted to computers. Indeed, this Gauss-Jordan
matrix inversion technique will be available in the program library as a FORTRAN
code or in other programming languages as well as in Mathematica,
Maple, and other symbolic computer manipulation software (see Sections 2.3
and 2.4 of Press et al., 1992).

For matrices of special form the inverse matrix can be given in
closed form; for example, for

\[ A = \begin{pmatrix} a & b & c \\ b & d & b \\ c & b & e \end{pmatrix} \tag{3.66} \]

the inverse matrix has a similar but slightly more general form

\[ A^{-1} = \begin{pmatrix} \alpha & \beta_1 & \gamma \\ \beta_1 & \delta & \beta_2 \\ \gamma & \beta_2 & \epsilon \end{pmatrix}, \tag{3.67} \]

with matrix elements given by

\[
\begin{gathered}
D\alpha = ed - b^2, \quad D\gamma = -(cd - b^2), \quad D\beta_1 = (c - e)b, \quad D\beta_2 = (c - a)b, \\
D\delta = ae - c^2, \quad D\epsilon = ad - b^2, \quad D = b^2(2c - a - e) + d(ae - c^2),
\end{gathered}
\]

where D = det(A) is the determinant of the matrix A. If e = a in A, then the
inverse matrix $A^{-1}$ also simplifies to

\[ \beta_1 = \beta_2, \quad \epsilon = \alpha, \quad D = (a^2 - c^2)d + 2(c - a)b^2. \]

As a check, let us work out the 11 matrix element of the product $A A^{-1} = 1$.
We find

\[
\begin{aligned}
a\alpha + b\beta_1 + c\gamma &= \frac{1}{D} \left[ a(ed - b^2) + b^2(c - e) - c(cd - b^2) \right] \\
&= \frac{1}{D} \left( -ab^2 + aed + 2b^2 c - b^2 e - c^2 d \right) = \frac{D}{D} = 1.
\end{aligned}
\]

Similarly, we check that the 12 matrix element vanishes,

\[ a\beta_1 + b\delta + c\beta_2 = \frac{1}{D} \left[ ab(c - e) + b(ae - c^2) + cb(c - a) \right] = 0, \]

etc.

Note that we cannot always find an inverse of $A^{-1}$ by solving for the matrix
elements a, b, ... of A because not every inverse matrix $A^{-1}$ of the form in
Eq. (3.67) has a corresponding A of the special form in Eq. (3.66), as Example 3.2.6
clearly shows.


SUMMARY


Matrices are square or rectangular arrays of numbers that define linear transformations
like rotations of a coordinate system. As such, they are linear operators.
Square matrices may be inverted when their determinant is nonzero.
When a matrix defines a system of linear equations, the inverse matrix solves
it. Matrices with the same number of rows and columns may be added and
subtracted. They form what mathematicians call a ring, with a unit and a zero
matrix. Matrices are also useful for representing group operations and operators
in Hilbert spaces.


EXERCISES 

3.2.1 Show that matrix multiplication is associative: (AB)C = A(BC). 

3.2.2 Show that 

(A + B)(A - B) = A 2 - B 2 
if and only if A and B commute, 

AB - BA = [A, B] = 0. 

3.2.3 Show that matrix A is a linear operator by showing that

\[ A(c_1 \mathbf{r}_1 + c_2 \mathbf{r}_2) = c_1 A \mathbf{r}_1 + c_2 A \mathbf{r}_2, \]

where $c_j$ are arbitrary constants and $\mathbf{r}_j$ are vectors. It can be shown
that an n x n matrix is the most general linear operator in an n-dimensional
vector space. This means that every linear operator in
this n-dimensional vector space is equivalent to a matrix.

3.2.4 If A is an n x n matrix, show that

\[ \det(-A) = (-1)^n \det A. \]

3.2.5 (a) The matrix equation $A^2 = 0$ does not imply A = 0. Show that the
most general 2 x 2 matrix whose square is zero may be written as

\[ \begin{pmatrix} ab & b^2 \\ -a^2 & -ab \end{pmatrix}, \]

where a and b are real or complex numbers.

(b) If C = A + B, in general

\[ \det C \neq \det A + \det B. \]

Construct a specific numerical example to illustrate this inequality.

3.2.6 Verify the Jacobi identity 

[A, [B, C]] = [B, [A, C]] - [C, [A, B]]. 

This is useful in quantum mechanics and for generators of Lie groups 
(see Chapter 4). As a mnemonic aid, the reader might note that the 
Jacobi identity has the same form as the BAC-CAB rule of Section 1.5. 

3.2.7 Show that the matrices

\[
A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]

satisfy the commutation relations

\[ [A, B] = C, \quad [A, C] = 0, \quad \text{and} \quad [B, C] = 0. \]

These matrices can be interpreted as raising operators for angular
momentum 1.

3.2.8 A matrix with elements $a_{ij} = 0$ for j < i may be called upper right
triangular. The elements in the lower left (below and to the left of the
main diagonal) vanish.

Show that the product of two upper right triangular matrices is an 
upper right triangular matrix. The same applies to lower left triangular 
matrices. 

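3.2.9 The three Pauli spin matrices are

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \text{and} \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]

Show that

(a) $\sigma_i^2 = 1_2$,

(b) $\sigma_i \sigma_j = i\sigma_k$, (i, j, k) = (1, 2, 3), (2, 3, 1), (3, 1, 2) (cyclic permutation), and

(c) $\sigma_i \sigma_j + \sigma_j \sigma_i = 2\delta_{ij} 1_2$, where $1_2$ is the identity 2 x 2 matrix.

These matrices were used by Pauli in the nonrelativistic theory of
electron spin.

3.2.10 Using the Pauli $\sigma$ of Exercise 3.2.9, show that

\[ (\boldsymbol{\sigma} \cdot \mathbf{a})(\boldsymbol{\sigma} \cdot \mathbf{b}) = \mathbf{a} \cdot \mathbf{b} \, 1_2 + i \boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}). \]

Here

\[ \boldsymbol{\sigma} \equiv \hat{\mathbf{x}} \sigma_1 + \hat{\mathbf{y}} \sigma_2 + \hat{\mathbf{z}} \sigma_3 \]

and a and b are ordinary vectors and $1_2$ is the 2 x 2 unit matrix.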

3.2.11 One description of spin 1 particles uses the matrices

\[
M_x = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
M_y = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix},
\]

and

\[ M_z = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \]

Show that

(a) $[M_x, M_y] = iM_z$, and so on$^{12}$ (cyclic permutation of indices). Using
the Levi-Civita symbol of Sections 2.9 and 3.4, we may write

\[ [M_i, M_j] = i \varepsilon_{ijk} M_k, \]

which are the commutation relations of angular momentum.

(b) $M^2 \equiv M_x^2 + M_y^2 + M_z^2 = 2 \cdot 1_3$,
where $1_3$ is the 3 x 3 unit matrix.

(c) $[M^2, M_i] = 0$,
$[M_z, L^+] = L^+$,
$[L^+, L^-] = 2M_z$,
where

\[ L^+ \equiv M_x + iM_y, \qquad L^- \equiv M_x - iM_y. \]

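3.2.12 Repeat Exercise 3.2.11 using an alternate representation,

\[
M_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
M_y = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix},
\]

and

\[ M_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

In Chapter 4 these matrices appear as the generators of the rotation
group.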


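3.2.13 Repeat Exercise 3.2.11 using the matrices for a spin of 3/2,

\[
M_x = \frac{1}{2} \begin{pmatrix}
0 & \sqrt{3} & 0 & 0 \\
\sqrt{3} & 0 & 2 & 0 \\
0 & 2 & 0 & \sqrt{3} \\
0 & 0 & \sqrt{3} & 0
\end{pmatrix}, \quad
M_y = \frac{i}{2} \begin{pmatrix}
0 & -\sqrt{3} & 0 & 0 \\
\sqrt{3} & 0 & -2 & 0 \\
0 & 2 & 0 & -\sqrt{3} \\
0 & 0 & \sqrt{3} & 0
\end{pmatrix},
\]

and

\[
M_z = \frac{1}{2} \begin{pmatrix}
3 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -3
\end{pmatrix}.
\]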


3.2.14 An operator $P$ commutes with $J_x$ and $J_y$, the $x$ and $y$ components of
an angular momentum operator. Show that $P$ commutes with the third
component of angular momentum; that is,

\[
[P, J_z] = 0.
\]

Hint. The angular momentum components must satisfy the commutation
relation of Exercise 3.2.11(a).


12 [A, B] = AB - BA. 
3.2.15 The $L^+$ and $L^-$ matrices of Exercise 3.2.11 are ladder operators (see
Chapter 4): $L^+$ operating on a system of spin projection $m$ will raise
the spin projection to $m + 1$ if $m$ is below its maximum. $L^+$ operating
on $m_{\max}$ yields zero. $L^-$ reduces the spin projection in unit steps in a
similar fashion. Dividing by $\sqrt{2}$, we have

\[
L^+ = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
L^- = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.
\]

Show that

$L^+ |{-1}\rangle = |0\rangle$, $L^- |{-1}\rangle$ = null column vector,

$L^+ |0\rangle = |1\rangle$, $L^- |0\rangle = |{-1}\rangle$,

$L^+ |1\rangle$ = null column vector, $L^- |1\rangle = |0\rangle$,

where

\[
|{-1}\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad
|0\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \text{and} \quad
|1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
\]

represent states of spin projection $-1$, $0$, and $1$, respectively.

Note. Differential operator analogs of these ladder operators appear in 
Exercise 11.6.7. 
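
The action of the ladder matrices on the three column vectors can be tabulated directly. This is a small illustrative sketch; the dictionary `KET` and the function `apply` are hypothetical names introduced here, and the column vectors follow the standard assignment $|1\rangle = (1, 0, 0)$, $|0\rangle = (0, 1, 0)$, $|{-1}\rangle = (0, 0, 1)$.

```python
# Ladder operators for spin 1 (already divided by sqrt(2), as in the exercise).
L_PLUS = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
L_MINUS = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

# Column vectors labeled by spin projection m.
KET = {-1: [0, 0, 1], 0: [0, 1, 0], 1: [1, 0, 0]}
NULL = [0, 0, 0]


def apply(matrix, vector):
    """Multiply a 3x3 matrix into a column vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


for m in (-1, 0, 1):
    print(m, "raised ->", apply(L_PLUS, KET[m]),
          "lowered ->", apply(L_MINUS, KET[m]))
```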

3.2.16 Vectors A and B are related by the tensor T: 

B = TA. 

Given A and B, show that there is no unique solution for the components
of T. This is why vector division B/A is undefined (apart from
the special case of A and B parallel and T then a scalar).

3.2.17 We might ask for a vector $\mathbf{A}^{-1}$, an inverse of a given vector $\mathbf{A}$ in the
sense that

\[
\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{A}^{-1} \cdot \mathbf{A} = 1.
\]

Show that this relation does not suffice to define $\mathbf{A}^{-1}$ uniquely. $\mathbf{A}$ has
literally an infinite number of inverses.

3.2.18 If A is a diagonal matrix, with all diagonal elements different, and A 
and B commute, show that B is diagonal. 

3.2.19 If A and B are diagonal matrices, show that A and B commute. 

3.2.20 Angular momentum matrices satisfy commutation relations

\[
[M_i, M_j] = iM_k, \quad i, j, k \text{ cyclic}.
\]

Show that the trace of each angular momentum matrix vanishes. 
Explain why. 







3.2.21 (a) The operator tr replaces a matrix A by its trace; that is,

\[
\operatorname{tr}(A) = \operatorname{trace}(A) = \sum_i a_{ii}.
\]

Show that tr is a linear operator. 

(b) The operator det replaces a matrix A by its determinant; that is, 
det(A) = determinant of A. 

Show that det is not a linear operator. 

3.2.22 A and B anticommute. Also, $A^2 = 1$, $B^2 = 1$. Show that trace(A) =
trace(B) = 0.

Note. The Pauli (Section 3.4) matrices are specific examples. 


3.2.23 If a matrix has an inverse, show that the inverse is unique. 

3.2.24 If $A^{-1}$ has elements

\[
(A^{-1})_{ij} = \frac{C_{ji}}{|A|},
\]

where $C_{ji}$ is the $ji$th cofactor of $|A|$, show that

\[
A^{-1} A = 1.
\]

Hence, $A^{-1}$ is the inverse of A (if $|A| \neq 0$).

Note. In numerical work it sometimes happens that |A| is almost equal 
to zero. Then there is trouble. 

3.2.25 Show that $\det A^{-1} = (\det A)^{-1}$.

Hint. Apply the product theorem of Section 3.2. 

Note. If det A is zero, then A has no inverse. A is singular. 


3.2.26 Find the inverse of

\[
A = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 2 & 1 \\ 1 & 1 & 4 \end{pmatrix}.
\]


3.2.27 Explain why the inverse process starting from a matrix $A^{-1}$ of the form
in Eq. (3.67) and solving for the matrix elements $a, b, \ldots$ of A does not
always work.


3.3 Orthogonal Matrices 


Ordinary three-dimensional space may be described with the Cartesian
coordinates $(x_1, x_2, x_3)$. We consider a second set of Cartesian coordinates
$(x_1', x_2', x_3')$, whose origin and handedness coincide with those of the first set
but whose orientation is different (Fig. 3.5). We can say that the primed
coordinate axes have been rotated relative to the initial, unprimed coordinate
axes. Since this rotation is a linear operation, we expect a matrix equation
relating the primed coordinates to the unprimed coordinates.











Chapter 3 Determinants and Matrices 


Figure 3.5 Rotation of Cartesian Coordinate Systems



This section repeats portions of Chapter 2 in a slightly different context 
and with a different emphasis. Previously, attention was focused on a vector 
or tensor. Transformation properties were strongly stressed and were critical. 
Here, emphasis is on the description of the coordinate rotation—the matrix. 
Transformation properties, the behavior of the matrix when the coordinate 
system is changed, appear at the end of this section. Sections 3.4 and 3.5 
continue with transformation properties of complex vector spaces. 


Direction Cosines 


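A unit vector along the $x_1'$-axis ($\hat{\mathbf{x}}_1'$) may be resolved into components along
the $x_1$-, $x_2$-, and $x_3$-axes by the usual projection technique:

\[
\hat{\mathbf{x}}_1' = \hat{\mathbf{x}}_1 \cos(x_1', x_1) + \hat{\mathbf{x}}_2 \cos(x_1', x_2) + \hat{\mathbf{x}}_3 \cos(x_1', x_3). \tag{3.68}
\]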


Equation (3.68) is a specific example of the linear relations discussed at the 
beginning of Section 3.2. In two dimensions, this decomposition of a vector in 
rotated coordinates is illustrated in Fig. 2.18 in conjunction with Fig. 3.5. 

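For convenience, these cosines, which are the direction cosines, are labeled

\[
\begin{aligned}
\cos(x_1', x_1) &= \hat{\mathbf{x}}_1' \cdot \hat{\mathbf{x}}_1 = a_{11}, \\
\cos(x_1', x_2) &= \hat{\mathbf{x}}_1' \cdot \hat{\mathbf{x}}_2 = a_{12}, \\
\cos(x_1', x_3) &= \hat{\mathbf{x}}_1' \cdot \hat{\mathbf{x}}_3 = a_{13}.
\end{aligned} \tag{3.69}
\]

Continuing, we have

\[
\begin{aligned}
\cos(x_2', x_1) &= \hat{\mathbf{x}}_2' \cdot \hat{\mathbf{x}}_1 = a_{21}, \\
\cos(x_2', x_2) &= \hat{\mathbf{x}}_2' \cdot \hat{\mathbf{x}}_2 = a_{22},
\end{aligned} \tag{3.70}
\]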
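and so on, where $a_{21} \neq a_{12}$ in general. Now Eq. (3.68) may be rewritten as

\[
\hat{\mathbf{x}}_1' = \hat{\mathbf{x}}_1 a_{11} + \hat{\mathbf{x}}_2 a_{12} + \hat{\mathbf{x}}_3 a_{13}
\]

and also

\[
\begin{aligned}
\hat{\mathbf{x}}_2' &= \hat{\mathbf{x}}_1 a_{21} + \hat{\mathbf{x}}_2 a_{22} + \hat{\mathbf{x}}_3 a_{23}, \\
\hat{\mathbf{x}}_3' &= \hat{\mathbf{x}}_1 a_{31} + \hat{\mathbf{x}}_2 a_{32} + \hat{\mathbf{x}}_3 a_{33}.
\end{aligned} \tag{3.71}
\]

We may also go the other way by resolving $\hat{\mathbf{x}}_1$, $\hat{\mathbf{x}}_2$, and $\hat{\mathbf{x}}_3$ into components
in the primed system. Then

\[
\begin{aligned}
\hat{\mathbf{x}}_1 &= \hat{\mathbf{x}}_1' a_{11} + \hat{\mathbf{x}}_2' a_{21} + \hat{\mathbf{x}}_3' a_{31}, \\
\hat{\mathbf{x}}_2 &= \hat{\mathbf{x}}_1' a_{12} + \hat{\mathbf{x}}_2' a_{22} + \hat{\mathbf{x}}_3' a_{32}, \\
\hat{\mathbf{x}}_3 &= \hat{\mathbf{x}}_1' a_{13} + \hat{\mathbf{x}}_2' a_{23} + \hat{\mathbf{x}}_3' a_{33}.
\end{aligned} \tag{3.72}
\]

Associating $\hat{\mathbf{x}}_1$ and $\hat{\mathbf{x}}_1'$ with the subscript 1, $\hat{\mathbf{x}}_2$ and $\hat{\mathbf{x}}_2'$ with the subscript 2,
$\hat{\mathbf{x}}_3$ and $\hat{\mathbf{x}}_3'$ with the subscript 3, we see that in each case the first subscript of
$a_{ij}$ refers to the primed unit vector ($\hat{\mathbf{x}}_1'$, $\hat{\mathbf{x}}_2'$, $\hat{\mathbf{x}}_3'$), whereas the second subscript
refers to the unprimed unit vector ($\hat{\mathbf{x}}_1$, $\hat{\mathbf{x}}_2$, $\hat{\mathbf{x}}_3$).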


Applications to Vectors 


If we consider a vector whose components are functions of the position in
space, then

\[
\begin{aligned}
\mathbf{V}(x_1, x_2, x_3) &= \hat{\mathbf{x}}_1 V_1 + \hat{\mathbf{x}}_2 V_2 + \hat{\mathbf{x}}_3 V_3, \\
\mathbf{V}'(x_1', x_2', x_3') &= \hat{\mathbf{x}}_1' V_1' + \hat{\mathbf{x}}_2' V_2' + \hat{\mathbf{x}}_3' V_3'
\end{aligned} \tag{3.73}
\]

since the point may be given by both the coordinates $(x_1, x_2, x_3)$ and the
coordinates $(x_1', x_2', x_3')$. Note that $\mathbf{V}$ and $\mathbf{V}'$ are geometrically the same vector
(but with different components). The coordinate axes are being rotated; the
vector stays fixed. Using Eq. (3.72) to eliminate $\hat{\mathbf{x}}_1$, $\hat{\mathbf{x}}_2$, and $\hat{\mathbf{x}}_3$, we may separate
Eq. (3.73) into three scalar equations:

\[
\begin{aligned}
V_1' &= a_{11} V_1 + a_{12} V_2 + a_{13} V_3, \\
V_2' &= a_{21} V_1 + a_{22} V_2 + a_{23} V_3, \\
V_3' &= a_{31} V_1 + a_{32} V_2 + a_{33} V_3.
\end{aligned} \tag{3.74}
\]

In particular, these relations will hold for the coordinates of a point $(x_1, x_2, x_3)$
and $(x_1', x_2', x_3')$, giving

\[
\begin{aligned}
x_1' &= a_{11} x_1 + a_{12} x_2 + a_{13} x_3, \\
x_2' &= a_{21} x_1 + a_{22} x_2 + a_{23} x_3, \\
x_3' &= a_{31} x_1 + a_{32} x_2 + a_{33} x_3,
\end{aligned} \tag{3.75}
\]

and similarly for the primed coordinates. In this notation, the set of three
equations may be written as

\[
x_i' = \sum_{j=1}^{3} a_{ij} x_j, \tag{3.76}
\]

where $i$ takes on the values 1, 2, and 3 and the result is three separate
equations, or it is just one matrix equation

\[
|x'\rangle = A |x\rangle
\]

involving the matrix A, the column vector $|x\rangle$, and matrix multiplication.

Now let us set aside these results and try a different approach to the same
problem. We consider two coordinate systems $(x_1, x_2, x_3)$ and $(x_1', x_2', x_3')$ with
a common origin and one point $(x_1, x_2, x_3)$ in the unprimed system, $(x_1', x_2', x_3')$
in the primed system. Note the usual ambiguity. The same symbol $x$ denotes
both the coordinate axis and a particular distance along that axis. Since our
system is linear, $x_i'$ must be a linear combination of the $x_j$. Let

\[
x_i' = \sum_{j=1}^{3} a_{ij} x_j. \tag{3.77}
\]

The $a_{ij}$ may be identified as our old friends, the direction cosines. This
identification is carried out for the two-dimensional case later.

If we have two sets of quantities $(V_1, V_2, V_3)$ in the unprimed system and
$(V_1', V_2', V_3')$ in the primed system, related in the same way as the coordinates
of a point in the two different systems [Eq. (3.77)],

\[
V_i' = \sum_{j=1}^{3} a_{ij} V_j, \tag{3.78}
\]

then, as in Section 2.6, the quantities $(V_1, V_2, V_3)$ are defined as the components
of a vector that stay fixed while the coordinates rotate; that is, a vector is
defined in terms of transformation properties of its components under a rotation
of the coordinate axes. In a sense, the coordinates of a point have been taken
as a prototype vector. The power and usefulness of this definition became
apparent in Chapter 2, in which it was extended to define pseudovectors and
tensors.

From Eq. (3.76) we can derive interesting information about the $a_{ij}$ that
describe the orientation of coordinate system $(x_1', x_2', x_3')$ relative to the system
$(x_1, x_2, x_3)$. The length from the origin to the point is the same in both systems.
Squaring,$^{13}$

\[
\sum_i x_i^2 = \sum_i x_i'^2 = \sum_i \Bigl( \sum_j a_{ij} x_j \Bigr) \Bigl( \sum_k a_{ik} x_k \Bigr) = \sum_{j,k} x_j x_k \sum_i a_{ij} a_{ik}. \tag{3.79}
\]

This can be true for all points if and only if

\[
\sum_i a_{ij} a_{ik} = \delta_{jk}, \quad j, k = 1, 2, 3. \tag{3.80}
\]


13 Note that two independent indices $j$ and $k$ are used.
Note that Eq. (3.80), the orthogonality condition, does not conform to our
definition of matrix multiplication, but it can be put in the required form by
using the transpose matrix $\tilde{A}$ such that

\[
\tilde{a}_{ji} = a_{ij}. \tag{3.81}
\]

Then Eq. (3.80) becomes

\[
\tilde{A} A = 1. \tag{3.82}
\]

This is a restatement of the orthogonality condition and may be taken as a
definition of orthogonality. Multiplying Eq. (3.82) by $A^{-1}$ from the right and
using Eq. (3.55), we have

\[
\tilde{A} = A^{-1}. \tag{3.83}
\]

This important matrix result that the inverse equals the transpose holds
only for orthogonal matrices and indeed may be taken as a further restatement
of the orthogonality condition.

Multiplying Eq. (3.83) by A from the left, we obtain

\[
A \tilde{A} = 1 \tag{3.84}
\]

or

\[
\sum_i a_{ji} a_{ki} = \delta_{jk}, \tag{3.85}
\]

which is still another form of the orthogonality condition.

Summarizing, the orthogonality condition may be stated in several equivalent
ways:

\[
\sum_i a_{ij} a_{ik} = \delta_{jk} \tag{3.86}
\]

\[
\sum_i a_{ji} a_{ki} = \delta_{jk} \tag{3.87}
\]

\[
\tilde{A} A = A \tilde{A} = 1 \tag{3.88}
\]

\[
\tilde{A} = A^{-1}. \tag{3.89}
\]

Any one of these relations is a necessary and a sufficient condition for A to be
orthogonal.

Taking determinants of Eq. (3.84) implies (recall $|\tilde{A}| = |A|$)

\[
|A| |\tilde{A}| = |A|^2 = 1, \quad \text{so that } |A| = \pm 1.
\]
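
The equivalent statements above translate directly into a numerical test. The sketch below is an illustration under assumed names (`is_orthogonal`, `det3`, and the three sample matrices are not from the text): it checks $\tilde{A} A = 1$ and evaluates the determinant for a rotation, a reflection, and a shear. The shear is included to show that $|A| = \pm 1$ is necessary but not sufficient for orthogonality.

```python
import math


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def is_orthogonal(a, tol=1e-12):
    """Check the condition (transpose of A) times A = 1, Eq. (3.82)."""
    p = matmul(transpose(a), a)
    n = len(a)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


phi = 0.7  # arbitrary rotation angle in radians
rotation = [[math.cos(phi), math.sin(phi), 0.0],
            [-math.sin(phi), math.cos(phi), 0.0],
            [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

for name, m in (("rotation", rotation), ("reflection", reflection), ("shear", shear)):
    print(name, "orthogonal:", is_orthogonal(m), "det:", round(det3(m), 12))
```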

Moreover, we may describe Eq. (3.79) as stating that rotations leave lengths
invariant. Verification of Eq. (3.80), if needed, may be obtained by returning to
Eq. (3.79) and setting

\[
\mathbf{r} = (x_1, x_2, x_3) = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0),
\]

and so on to evaluate the nine relations given by Eq. (3.80). This process is valid
since Eq. (3.79) must hold for all $\mathbf{r}$ for a given set of $a_{ij}$. Equation (3.80), a
consequence of requiring that the length remain constant (invariant) under
rotation of the coordinate system, is called the orthogonality condition.
The $a_{ij}$, written as a matrix A, define an orthogonal matrix. Finally, in matrix
notation Eq. (3.76) becomes

\[
|x'\rangle = A |x\rangle. \tag{3.90}
\]

Orthogonality Conditions: Two-Dimensional Case 

A better understanding of $a_{ij}$ and the orthogonality condition may be gained
by considering rotation in two dimensions in detail. (This can be thought of
as a three-dimensional system with the $x_1$-, $x_2$-axes rotated about $x_3$.) From
Fig. 3.6,

\[
\begin{aligned}
x_1' &= x_1 \cos\varphi + x_2 \sin\varphi, \\
x_2' &= -x_1 \sin\varphi + x_2 \cos\varphi.
\end{aligned} \tag{3.91}
\]

Therefore, by Eq. (3.81),

\[
A = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}. \tag{3.92}
\]

Figure 3.6 Rotation

Notice that A reduces to the unit matrix for $\varphi = 0$. Zero angle rotation means
nothing has changed. It is clear from Fig. 3.6 that

\[
\begin{aligned}
a_{11} &= \cos\varphi = \cos(x_1', x_1), \\
a_{12} &= \sin\varphi = \cos\Bigl(\frac{\pi}{2} - \varphi\Bigr) = \cos(x_1', x_2),
\end{aligned} \tag{3.93}
\]
and so on, thus identifying the matrix elements $a_{ij}$ as the direction cosines.
Equation (3.80), the orthogonality condition, becomes

\[
\begin{aligned}
\sin^2\varphi + \cos^2\varphi &= 1, \\
\sin\varphi \cos\varphi - \sin\varphi \cos\varphi &= 0.
\end{aligned} \tag{3.94}
\]
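
The two-dimensional case is small enough to check by direct evaluation. The following sketch (function names `rotation2d` and `apply`, the angle, and the sample point are assumptions made for illustration) builds the matrix of Eq. (3.92), evaluates the column sums of Eq. (3.80), and confirms that the length of a point's position vector is unchanged.

```python
import math


def rotation2d(phi):
    """Matrix of Eq. (3.92): axes rotated counterclockwise through phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]


def apply(a, x):
    """Components of the fixed point x in the rotated (primed) axes."""
    return [sum(a[i][j] * x[j] for j in range(2)) for i in range(2)]


phi = math.pi / 6
A = rotation2d(phi)

# Orthogonality condition, Eq. (3.80), written out for j = k = 1 and j = 1, k = 2.
col_norm = A[0][0] ** 2 + A[1][0] ** 2
col_cross = A[0][0] * A[0][1] + A[1][0] * A[1][1]

x = [2.0, 1.0]
x_prime = apply(A, x)
length_sq = x[0] ** 2 + x[1] ** 2
length_sq_prime = x_prime[0] ** 2 + x_prime[1] ** 2

print("column sums:", col_norm, col_cross)
print("squared lengths:", length_sq, length_sq_prime)
```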


The extension to three dimensions (rotation of the coordinates through an
angle $\varphi$ counterclockwise about $x_3$) is simply

\[
A = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.95}
\]

The matrix element $a_{33}$ being equal to 1 expresses the fact that $x_3' = x_3$ since
the rotation is about the $x_3$-axis. The zeros guarantee that $x_1'$ and $x_2'$ do not
depend on $x_3$ and that $x_3'$ does not depend on $x_1$ and $x_2$.

It is now possible to see and understand why the term orthogonal is
appropriate for these matrices. We have the general form

\[
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},
\]

a matrix of direction cosines in which $a_{ij}$ is the cosine of the angle between
$x_i'$ and $x_j$. Therefore, $a_{11}$, $a_{12}$, $a_{13}$ are the direction cosines of $x_1'$ relative to
$x_1$, $x_2$, $x_3$. These three elements of A define a unit length along $x_1'$, that is, a
unit vector $\hat{\mathbf{x}}_1'$:

\[
\hat{\mathbf{x}}_1' = \hat{\mathbf{x}}_1 a_{11} + \hat{\mathbf{x}}_2 a_{12} + \hat{\mathbf{x}}_3 a_{13}.
\]

The orthogonality relations [Eqs. (3.86)-(3.89)] are simply a statement that the
unit vectors $\hat{\mathbf{x}}_1'$, $\hat{\mathbf{x}}_2'$, and $\hat{\mathbf{x}}_3'$ are mutually perpendicular or orthogonal. Our
orthogonal transformation matrix A transforms one orthogonal coordinate
system into a second orthogonal coordinate system by rotation [and reflection
if det(A) = $-1$].

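As an example of the use of matrices, the unit vectors in spherical polar
coordinates may be written as

\[
\begin{pmatrix} \hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\varphi}} \end{pmatrix}
= C \begin{pmatrix} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{pmatrix}, \tag{3.96}
\]

where C is given in Exercise 2.5.1. This is equivalent to Eq. (3.68) with $\hat{\mathbf{x}}_1'$, $\hat{\mathbf{x}}_2'$,
and $\hat{\mathbf{x}}_3'$ replaced by $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\varphi}}$. From the preceding analysis, C is orthogonal.
Therefore, the inverse relation becomes

\[
\begin{pmatrix} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{pmatrix}
= C^{-1} \begin{pmatrix} \hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\varphi}} \end{pmatrix}
= \tilde{C} \begin{pmatrix} \hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\varphi}} \end{pmatrix} \tag{3.97}
\]

and Exercise 2.5.5 is solved by inspection. Similar applications of matrix
inverses appear in connection with the transformation of a power series into
a series of orthogonal functions (Gram-Schmidt orthogonalization in Section
9.3).

Figure 3.7 (a) Rotation About $x_3$ Through Angle $\alpha$, (b) Rotation About $x_2'$ Through Angle $\beta$, and (c) Rotation About $x_3''$ Through Angle $\gamma$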


Euler Angles 


Our transformation matrix A contains nine direction cosines. Clearly, only
three of these are independent; Eq. (3.80) provides six constraints. Equivalently,
we may say that two parameters ($\theta$ and $\varphi$ in spherical polar coordinates)
are required to fix the axis of rotation. Then one additional parameter
describes the amount of rotation about the specified axis. In the Lagrangian
formulation of mechanics (Section 17.3) it is necessary to describe A by using
some set of three independent parameters rather than the redundant direction
cosines. The usual choice of parameters is the Euler angles.$^{14}$

The goal is to describe the orientation of a final rotated system $(x_1''', x_2''', x_3''')$
relative to some initial coordinate system $(x_1, x_2, x_3)$. The final system is
developed in three steps, with each step involving one rotation described by one
Euler angle (Fig. 3.7):


1. The coordinates are rotated about the $x_3$-axis through an angle $\alpha$
counterclockwise relative to $x_1$, $x_2$, $x_3$ into new axes denoted by $x_1'$, $x_2'$, $x_3'$.
(The $x_3$- and $x_3'$-axes coincide.)

2. The coordinates are rotated about the $x_2'$-axis$^{15}$ through an angle $\beta$
counterclockwise relative to $x_1'$, $x_2'$, $x_3'$ into new axes denoted by $x_1''$, $x_2''$, $x_3''$.
(The $x_2'$- and the $x_2''$-axes coincide.)

3. The third and final rotation is through an angle $\gamma$ counterclockwise about
the $x_3''$-axis, yielding the $x_1'''$, $x_2'''$, $x_3'''$ system. (The $x_3''$- and $x_3'''$-axes coincide.)


14 There are almost as many definitions of the Euler angles as there are authors. Here, we follow
the choice generally made by workers in the area of group theory and the quantum theory of
angular momentum (compare Sections 4.3 and 4.4).

15 Some authors choose this second rotation to be about the $x_1'$-axis.
The three matrices describing these rotations are

\[
R_z(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{3.98}
\]

which has the same form as Eq. (3.95), but with a different sign for $\beta$ in

\[
R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix} \tag{3.99}
\]

and

\[
R_z(\gamma) = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{3.100}
\]

The total rotation is described by the triple matrix product:

\[
A(\alpha, \beta, \gamma) = R_z(\gamma) R_y(\beta) R_z(\alpha). \tag{3.101}
\]

Note the order: $R_z(\alpha)$ operates first, then $R_y(\beta)$, and finally $R_z(\gamma)$. Direct
multiplication gives

\[
A(\alpha, \beta, \gamma) = \begin{pmatrix}
\cos\gamma \cos\beta \cos\alpha - \sin\gamma \sin\alpha & \cos\gamma \cos\beta \sin\alpha + \sin\gamma \cos\alpha & -\cos\gamma \sin\beta \\
-\sin\gamma \cos\beta \cos\alpha - \cos\gamma \sin\alpha & -\sin\gamma \cos\beta \sin\alpha + \cos\gamma \cos\alpha & \sin\gamma \sin\beta \\
\sin\beta \cos\alpha & \sin\beta \sin\alpha & \cos\beta
\end{pmatrix}. \tag{3.102}
\]

Equating $A(a_{ij})$ with $A(\alpha, \beta, \gamma)$, element by element, yields the direction
cosines in terms of the three Euler angles. We could use this Euler angle
identification to verify the direction cosine identities [Eq. (2.106) of Section 2.6],
but the approach of Exercise 3.3.3 is much more elegant.
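
The triple product of Eq. (3.101) is easy to get wrong by hand, so a numerical comparison with the closed form of Eq. (3.102) is a useful safeguard. The sketch below is illustrative only; the function names (`rz`, `ry`, `euler`, `euler_explicit`) and the sample angles are assumptions, and the convention is the one used in this section (second rotation about the intermediate $x_2'$-axis).

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rz(t):
    """Rotation of the axes about the 3-axis, Eqs. (3.98) and (3.100)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def ry(t):
    """Rotation of the axes about the 2-axis, Eq. (3.99)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def euler(alpha, beta, gamma):
    """Triple product of Eq. (3.101); R_z(alpha) acts first."""
    return matmul(rz(gamma), matmul(ry(beta), rz(alpha)))


def euler_explicit(alpha, beta, gamma):
    """Closed form of Eq. (3.102)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[cg * cb * ca - sg * sa, cg * cb * sa + sg * ca, -cg * sb],
            [-sg * cb * ca - cg * sa, -sg * cb * sa + cg * ca, sg * sb],
            [sb * ca, sb * sa, cb]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


angles = (0.4, 1.1, -0.8)  # arbitrary sample values in radians
print(max_abs_diff(euler(*angles), euler_explicit(*angles)))
```

Because the matrices do not commute in general, reversing the order of the factors in `euler` gives a different result unless $\beta = 0$.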


Biographical Data 

Euler, Leonhard. Euler, a Swiss mathematician, was born in Basel in 1707
and died in St. Petersburg, Russia, in 1783. He studied with the Bernoulli
brothers and eventually became the most prolific mathematician of all time,
contributing to all then-existing branches of mathematics, and he founded
new ones such as graph theory. There are many Euler identities, equations,
and formulas, for example, for the exponential function of pure imaginary
argument, another relating Bernoulli numbers to the Riemann zeta function
at even integer argument, or in the calculus of variations. In 1733, he became
professor at the St. Petersburg Academy of Catherine I, widow of Peter the
Great. He left Russia during the political turmoil of the reign of the infant
Ivan VI for Frederick the Great's court of Prussia but returned to Russia when
Catherine the Great invited him back. He lost the sight of both eyes, one due
to solar observations and the other in 1766 due to intense work on
mathematics, but continued his productive work to his death.


Symmetry Properties and Similarity Transformations 


Our matrix formalism leads to the Euler angle description of rotations (which
forms a basis for developing the rotation group in Chapter 4). The power and
flexibility of matrices pushed quaternions into obscurity early in the twentieth century.$^{16}$

It will be noted that matrices have been handled in two ways in the foregoing 
discussion: by their components and as single entities. Each technique has its 
own advantages and both are useful. 

The transpose matrix is useful in a discussion of symmetry properties. If

\[
A = \tilde{A}, \quad a_{ij} = a_{ji}, \tag{3.103}
\]

the matrix is called symmetric, whereas if

\[
A = -\tilde{A}, \quad a_{ij} = -a_{ji}, \tag{3.104}
\]

it is called antisymmetric or skewsymmetric. The diagonal elements of an
antisymmetric matrix vanish. It is easy to show that any square matrix may
be written as the sum of a symmetric matrix and an antisymmetric matrix.
Consider the identity

\[
A = \tfrac{1}{2} [A + \tilde{A}] + \tfrac{1}{2} [A - \tilde{A}]. \tag{3.105}
\]

$[A + \tilde{A}]$ is clearly symmetric, whereas $[A - \tilde{A}]$ is clearly antisymmetric. This
is the matrix analog of Eq. (2.120) for tensors. Similarly, a function may be
broken up into its even and odd parts.
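
The decomposition of Eq. (3.105) can be carried out mechanically for any square matrix. A minimal sketch follows; the function names and the sample matrix are illustrative assumptions.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def symmetric_part(a):
    """First term of Eq. (3.105): (A + A~)/2."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]


def antisymmetric_part(a):
    """Second term of Eq. (3.105): (A - A~)/2."""
    n = len(a)
    return [[(a[i][j] - a[j][i]) / 2 for j in range(n)] for i in range(n)]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]

S = symmetric_part(A)
N = antisymmetric_part(A)

print("symmetric part:    ", S)
print("antisymmetric part:", N)
```

The diagonal of the antisymmetric part vanishes, as stated in the text, and adding the two parts element by element returns the original matrix.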

So far, we have interpreted the orthogonal matrix as rotating the coordinate
system. This changes the components of a fixed vector (not rotating with
the coordinates) (see Fig. 3.6). However, an orthogonal matrix A may be
interpreted equally well as a rotation of the vector in the opposite direction
(Fig. 3.8). These are the two possibilities: active transformation, rotating the
vector while keeping the coordinates fixed; and passive transformation, rotating
the coordinates (in the opposite sense) while keeping the vector fixed.

Suppose we interpret matrix A as a linear transformation of a vector $\mathbf{r}$ into
the position shown by $\mathbf{r}_1$; that is, in a particular coordinate system we have a
relation

\[
\mathbf{r}_1 = A \mathbf{r}. \tag{3.106}
\]


16 Stephenson, R. J. (1966). Development of vector analysis from quaternions. Am. J. Phys. 34, 
194. 
Figure 3.8 Fixed Coordinates, Rotated Vector

Now let us transform the coordinates linearly by applying matrix B, which
transforms $(x, y, z)$ into $(x', y', z')$:

\[
\begin{aligned}
\mathbf{r}_1' = B \mathbf{r}_1 = BA\mathbf{r} &= BA(B^{-1}B)\mathbf{r} \\
&= (BAB^{-1}) B\mathbf{r} = (BAB^{-1}) \mathbf{r}'.
\end{aligned} \tag{3.107}
\]

$B\mathbf{r}_1$ is just $\mathbf{r}_1'$ in the new coordinate system with a similar interpretation holding
for $B\mathbf{r}$. Hence, in this new system ($B\mathbf{r}$) is transformed into position ($B\mathbf{r}_1$) by
the matrix $BAB^{-1}$:

\[
\begin{array}{ccc}
B\mathbf{r}_1 & = (BAB^{-1}) & B\mathbf{r} \\
\downarrow & \downarrow & \downarrow \\
\mathbf{r}_1' & = A' & \mathbf{r}'.
\end{array}
\]

In the new system the coordinates have been transformed by matrix B, and A
has the form $A'$, in which

\[
A' = BAB^{-1}. \tag{3.108}
\]

$A'$ operates in the $x'$, $y'$, $z'$ space as A operates in the $x$, $y$, $z$ space.

For the determinant $|A'|$ the product theorem gives

\[
|A'| = |B| |A| |B^{-1}| = |A|. \tag{3.109a}
\]

For the trace we find

\[
\operatorname{trace}(BAB^{-1}) = \operatorname{trace}(B^{-1}BA) = \operatorname{trace}(A) \tag{3.109b}
\]

so that both the trace and the determinant of a square matrix stay invariant
under a coordinate transformation, and rotations in particular.
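
The invariance of trace and determinant does not require B to be orthogonal, only invertible. The following sketch checks Eqs. (3.109a) and (3.109b) exactly for one $2 \times 2$ example using rational arithmetic; the matrices and helper names (`inverse2`, `det2`, `trace`) are assumptions chosen for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def trace(a):
    return a[0][0] + a[1][1]


def inverse2(b):
    """Exact inverse of a 2x2 matrix with nonzero determinant."""
    d = Fraction(det2(b))
    if d == 0:
        raise ValueError("matrix is singular")
    return [[b[1][1] / d, -b[0][1] / d],
            [-b[1][0] / d, b[0][0] / d]]


A = [[2, 1], [0, 3]]
B = [[1, 2], [3, 4]]  # invertible but not orthogonal

A_prime = matmul(matmul(B, A), inverse2(B))  # Eq. (3.108)

print("A' =", A_prime)
print("trace:", trace(A), trace(A_prime))
print("det:  ", det2(A), det2(A_prime))
```

The individual elements of $A'$ differ from those of A, while the trace and determinant agree.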
The coordinate transformation defined by Eq. (3.108) with B any matrix,
not necessarily orthogonal, is known as a similarity transformation. In
component form, Eq. (3.108) becomes

\[
a'_{ij} = \sum_{k,l} b_{ik} a_{kl} (B^{-1})_{lj}. \tag{3.110}
\]

Now if B is orthogonal,

\[
(B^{-1})_{lj} = (\tilde{B})_{lj} = b_{jl}, \tag{3.111}
\]

and we have

\[
a'_{ij} = \sum_{k,l} b_{ik} a_{kl} b_{jl}. \tag{3.112}
\]

It may be helpful to think of A again as an operator, possibly as rotating
coordinate axes, relating angular momentum and angular velocity of a rotating
solid (Section 3.5). Matrix A is the representation in a given coordinate system.
However, there are directions associated with A (crystal axes, symmetry
axes in the rotating solid, and so on), so that the representation A depends on
the coordinates. The similarity transformation shows how the representation
changes with a change of coordinates.

Relation to Tensors 

Comparing Eq. (3.112) with the equations of Section 2.6, we see that it defines
a tensor of second rank. Hence, a matrix that transforms by an orthogonal
similarity transformation is, by definition, a tensor. Clearly, then, any matrix
A, interpreted as linearly transforming a vector [Eq. (3.106)], may be called a
tensor. If, however, we consider an orthogonal matrix as a collection of fixed
direction cosines, giving the new orientation of a coordinate system, there is
no tensor transformation involved.

The symmetry and antisymmetry properties defined earlier are preserved
under orthogonal similarity transformations. Let A be a symmetric matrix,
$A = \tilde{A}$, and

\[
A' = BAB^{-1}. \tag{3.113}
\]

Now

\[
\tilde{A}' = \tilde{B}^{-1} \tilde{A} \tilde{B} = B \tilde{A} B^{-1} \tag{3.114}
\]

since B is orthogonal. However, $A = \tilde{A}$. Therefore,

\[
\tilde{A}' = BAB^{-1} = A', \tag{3.115}
\]

showing that the property of symmetry is invariant under an orthogonal
similarity transformation. In general, symmetry is not preserved under a
nonorthogonal similarity transformation.
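
A concrete pair of examples makes the last two statements tangible. In the sketch below (all matrices and names are illustrative assumptions), a symmetric matrix is transformed once with an exactly orthogonal B built from the 3-4-5 triangle and once with a nonorthogonal, invertible B; rational arithmetic keeps the comparison exact.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def is_symmetric(a):
    return a == transpose(a)


A = [[2, 1], [1, 3]]  # symmetric

# Orthogonal B (a rotation with cos = 3/5, sin = 4/5); its inverse is its transpose.
B_orth = [[Fraction(3, 5), Fraction(4, 5)],
          [Fraction(-4, 5), Fraction(3, 5)]]
A_orth = matmul(matmul(B_orth, A), transpose(B_orth))

# Nonorthogonal B with a known inverse.
B_gen = [[1, 2], [0, 1]]
B_gen_inv = [[1, -2], [0, 1]]
A_gen = matmul(matmul(B_gen, A), B_gen_inv)

print("orthogonal B:    A' =", A_orth, "symmetric:", is_symmetric(A_orth))
print("nonorthogonal B: A' =", A_gen, "symmetric:", is_symmetric(A_gen))
```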




SUMMARY 


Matrices that define rotations are composed of mutually orthogonal row and 
column vectors. Their inverse is the transposed matrix. 
EXERCISES

Note. Assume all matrix elements are real. 

3.3.1 Show that the product of two orthogonal matrices is orthogonal. Show 
that the inverse of an orthogonal matrix is orthogonal. 

Note. These are key steps in showing that all nxn orthogonal matrices 
form a group (Section 4.1). 

3.3.2 If A is orthogonal, show that its determinant $|A| = \pm 1$.

3.3.3 If A is orthogonal and $\det A = +1$, show that $(\det A)\, a_{ij} = C_{ij}$, where
$C_{ij}$ is the cofactor of $a_{ij}$. This yields the identities of Eq. (2.106) used
in Section 2.6 to show that a cross product of vectors (in three-space)
is itself a vector.

Hint. Note Exercise 3.2.24.

3.3.4 Another set of Euler rotations in common use is

(1) a rotation about the $x_3$-axis through an angle $\varphi$, counterclockwise,

(2) a rotation about the $x_1'$-axis through an angle $\theta$, counterclockwise,

(3) a rotation about the $x_3''$-axis through an angle $\psi$, counterclockwise.

If

\[
\begin{array}{ll}
\alpha = \varphi - \pi/2 & \varphi = \alpha + \pi/2 \\
\beta = \theta & \theta = \beta \\
\gamma = \psi + \pi/2 & \psi = \gamma - \pi/2,
\end{array}
\]

show that the final systems are identical.

3.3.5 Suppose the earth is moved (rotated) so that the north pole goes to 
30° north, 20° west (original latitude and longitude system) and the 10° 
west meridian points due south. 

(a) What are the Euler angles describing this rotation? 

(b) Find the corresponding direction cosines. 

\[
\text{ANS. (b)} \quad A = \begin{pmatrix} 0.9551 & -0.2552 & -0.1504 \\ 0.0052 & 0.5221 & -0.8529 \\ 0.2962 & 0.8138 & 0.5000 \end{pmatrix}.
\]

3.3.6 Verify that the Euler angle rotation matrix, Eq. (3.102), is invariant
under the transformation

\[
\alpha \to \alpha + \pi, \quad \beta \to -\beta, \quad \gamma \to \gamma - \pi.
\]

3.3.7 Show that the Euler angle rotation matrix $A(\alpha, \beta, \gamma)$ satisfies the
following relations:

(a) $A^{-1}(\alpha, \beta, \gamma) = \tilde{A}(\alpha, \beta, \gamma)$

(b) $A^{-1}(\alpha, \beta, \gamma) = A(-\gamma, -\beta, -\alpha)$.

3.3.8 Show that the trace of the product of a symmetric and an antisymmetric
matrix is zero.

3.3.9 Show that the trace of a matrix remains invariant under similarity
transformations.
3.3.10 Show that the determinant of a matrix remains invariant under
similarity transformations.

Note. Exercises 3.3.9 and 3.3.10 show that the trace and the determinant
are independent of the coordinates. They are characteristics
of the matrix (operator).

3.3.11 Show that the property of antisymmetry is invariant under orthogonal 
similarity transformations. 

3.3.12 A is 2 x 2 and orthogonal. Find its most general form. Compare with 
two-dimensional rotation. 

3.3.13 $|x\rangle$ and $|y\rangle$ are column vectors. Under an orthogonal transformation S,
$|x'\rangle = S|x\rangle$, $|y'\rangle = S|y\rangle$. Show that the scalar product $\langle x|y\rangle$ is invariant
under this orthogonal transformation.

Note. This is equivalent to the invariance of the dot product of two 
vectors (Section 2.6). 

3.3.14 Show that the sum of the squares of the elements of a matrix remains 
invariant under orthogonal similarity transformations. 

3.3.15 A rotation $\varphi_1 + \varphi_2$ about the $z$-axis is carried out as two successive
rotations $\varphi_1$ and $\varphi_2$, each about the $z$-axis. Use the matrix representation
of the rotations to derive the trigonometric identities:

\[
\begin{aligned}
\cos(\varphi_1 + \varphi_2) &= \cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2, \\
\sin(\varphi_1 + \varphi_2) &= \sin\varphi_1 \cos\varphi_2 + \cos\varphi_1 \sin\varphi_2.
\end{aligned}
\]

3.3.16 A column vector V has components $V_1$ and $V_2$ in an initial (unprimed)
system. Calculate $V_1'$ and $V_2'$ for a

(a) rotation of the coordinates through an angle of $\theta$ counterclockwise,

(b) rotation of the vector through an angle of $\theta$ clockwise.

The results for parts (a) and (b) should be identical.
3.4 Hermitian Matrices and Unitary Matrices


Definitions 


Thus far, it has generally been assumed that our linear vector space is a real 
space and that the matrix elements (the representations of the linear operators) 
are real. For many calculations in classical physics, real matrix elements will 
suffice. However, in quantum mechanics complex variables are unavoidable 
because of the form of the basic commutation relations (or the form of the 
time-dependent Schrodinger equation). With this in mind, we generalize to the 
case of complex matrix elements. To handle these elements, let us define, or 
label, some new properties: 








1. Complex conjugate, $A^* = (a_{ik}^*)$, formed by taking the complex conjugate
($i \to -i$) of each element $a_{ik}$ of A, where $i = \sqrt{-1}$.

2. Adjoint, $A^\dagger$, formed by transposing $A^*$,

\[
A^\dagger = (a_{ik})^\dagger = \widetilde{A^*} = \tilde{A}^* = (a_{ki}^*). \tag{3.116}
\]

3. Hermitian matrix: The matrix A is labeled Hermitian (or self-adjoint) if

\[
A = A^\dagger. \tag{3.117}
\]

If A is real, then $A^\dagger = \tilde{A}$ and real Hermitian matrices are real symmetric
matrices. In quantum mechanics (or matrix mechanics) matrices are usually
constructed to be Hermitian.

4. Unitary matrix: Matrix U is labeled unitary if

\[
U^\dagger = U^{-1}. \tag{3.118}
\]

If U is real, then $U^{-1} = \tilde{U}$ so that real unitary matrices are orthogonal
matrices. This represents a generalization of the concept of orthogonal
matrix [compare Eq. (3.83)].

5. $(AB)^* = A^* B^*$, $(AB)^\dagger = B^\dagger A^\dagger$.

If the matrix elements are complex, the physicist is almost always concerned
with Hermitian and unitary matrices. Unitary matrices are especially
important in quantum mechanics because they leave the length of a (complex)
vector unchanged, analogous to the operation of an orthogonal matrix
on a real vector. It is for this reason that the S matrix of scattering theory
is a unitary matrix. The transformation of an operator $A_S$ in the Schrödinger
representation of quantum mechanics to the Heisenberg representation $A_H =
U^\dagger A_S U$ with the unitary evolution operator is another example.

In a complex $n$-dimensional linear space the square of the length of a
vector $x = (x_1, x_2, \ldots, x_n)$, or the square of its distance from the origin 0, is
defined as

\[
x^\dagger x = \sum_{i=1}^{n} x_i^* x_i = \sum_{i=1}^{n} |x_i|^2.
\]

If a coordinate transformation $y = Ux$ leaves the distance unchanged, then

\[
x^\dagger x = y^\dagger y = (Ux)^\dagger Ux = x^\dagger U^\dagger U x.
\]

Since $x$ is arbitrary it follows that $U^\dagger U = 1_n$; that is, U is a unitary $n \times n$ matrix.

If $x' = Ax$ is a linear map, then its matrix in the new coordinates becomes
the unitary (complex analog of a similarity) transformation

\[
A' = U A U^\dagger \tag{3.119}
\]

because

\[
U x' = y' = U A x = U A U^{-1} y = U A U^\dagger y.
\]
Once we have the notion of length in a complex vector space, we can proceed
to define a scalar or inner product of vectors as

\[
\sum_i y_i^* x_i = y^\dagger \cdot x = \langle y | x \rangle.
\]

Two vectors are orthogonal if their scalar product is zero, $\langle y | x \rangle = 0$. The Pauli
spin matrices introduced next are also unitary.
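
The defining property $U^\dagger U = 1_n$ and the resulting invariance of length and inner product can be verified numerically for a small example. This is a sketch only; the particular matrix `U`, the vectors, and the helper names (`dagger`, `inner`, `apply`) are assumptions made for illustration.

```python
import math


def dagger(u):
    """Adjoint: transpose of the complex conjugate."""
    return [[u[j][i].conjugate() for j in range(len(u))] for i in range(len(u[0]))]


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(u, x):
    """Multiply a matrix into a column vector."""
    return [sum(u[i][j] * x[j] for j in range(len(x))) for i in range(len(u))]


def inner(y, x):
    """Scalar product <y|x> = sum_i y_i^* x_i."""
    return sum(yi.conjugate() * xi for yi, xi in zip(y, x))


s = 1 / math.sqrt(2)
U = [[s + 0j, 1j * s], [1j * s, s + 0j]]   # one example of a 2x2 unitary matrix

x = [1 + 2j, 3 - 1j]
y = [0.5j, -2 + 1j]

UdU = matmul(dagger(U), U)
print("U^dagger U =", UdU)
print("|x|^2 before and after:", inner(x, x).real, inner(apply(U, x), apply(U, x)).real)
print("<y|x> before and after:", inner(y, x), inner(apply(U, y), apply(U, x)))
```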



Pauli Matrices 


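The set of three $2 \times 2$ Pauli matrices $\boldsymbol{\sigma}$,

\[
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{3.120}
\]

were introduced by W. Pauli to describe a particle of spin 1/2 in nonrelativistic
quantum mechanics. It can readily be shown that (compare Exercises 3.2.9
and 3.2.10) the Pauli $\boldsymbol{\sigma}$ satisfy

\[
\sigma_i \sigma_j + \sigma_j \sigma_i = 2\delta_{ij} 1_2, \quad \text{anticommutation} \tag{3.121}
\]

\[
\sigma_i \sigma_j = i\sigma_k, \quad \text{cyclic permutation of indices} \tag{3.122}
\]

\[
(\sigma_i)^2 = 1_2, \tag{3.123}
\]

where $1_2$ is the $2 \times 2$ unit matrix. Thus, the vector $\boldsymbol{\sigma}/2$ satisfies the same
commutation relations

\[
[\sigma_i, \sigma_j] \equiv \sigma_i \sigma_j - \sigma_j \sigma_i = 2i \varepsilon_{ijk} \sigma_k \tag{3.124}
\]

as the orbital angular momentum $\mathbf{L}$ ($\mathbf{L} \times \mathbf{L} = i\mathbf{L}$; see Exercise 2.5.11).

The three Pauli matrices $\boldsymbol{\sigma}$ and the unit matrix form a complete set of $2 \times 2$
matrices so that any $2 \times 2$ matrix M may be expanded as

\[
M = m_0 1_2 + m_1 \sigma_1 + m_2 \sigma_2 + m_3 \sigma_3 = m_0 + \mathbf{m} \cdot \boldsymbol{\sigma}, \tag{3.125}
\]

where the $m_i$ form a constant vector $\mathbf{m} = (m_1, m_2, m_3)$. Using $\sigma_i^2 = 1$ and
trace($\sigma_i$) = 0, we obtain from Eq. (3.125) the expansion coefficients $m_i$ by
forming traces,

\[
2m_0 = \operatorname{trace}(M), \quad 2m_i = \operatorname{trace}(M\sigma_i), \quad i = 1, 2, 3. \tag{3.126}
\]

Adding and multiplying such $2 \times 2$ matrices, we generate the Pauli algebra.$^{17}$
Note that trace($\sigma_i$) = 0 for $i = 1, 2, 3$.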
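
The trace formulas of Eq. (3.126) give a direct recipe for the expansion coefficients. The sketch below applies them to one arbitrary complex matrix and rebuilds the matrix from Eq. (3.125); the sample matrix and the function names (`pauli_coefficients`, `rebuild`) are illustrative assumptions.

```python
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def pauli_coefficients(m):
    """Return m0 and (m1, m2, m3) from the traces of Eq. (3.126)."""
    m0 = trace(m) / 2
    mvec = [trace(matmul(m, s)) / 2 for s in SIGMA]
    return m0, mvec


def rebuild(m0, mvec):
    """Reassemble M = m0 1_2 + m . sigma, Eq. (3.125)."""
    return [[m0 * (1 if i == j else 0)
             + sum(mvec[n] * SIGMA[n][i][j] for n in range(3))
             for j in range(2)] for i in range(2)]


M = [[1 + 2j, 3], [-1j, 4 - 1j]]   # arbitrary complex 2x2 matrix
m0, mvec = pauli_coefficients(M)
M_again = rebuild(m0, mvec)

print("m0 =", m0, " m =", mvec)
print("rebuilt:", M_again)
```

For a Hermitian M the four coefficients come out real; for a general complex M, as here, they are complex.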

The spin algebra generated by the Pauli matrices is a matrix representation
of the four-dimensional Clifford algebra, whereas Hestenes and coworkers
have developed in their geometric calculus a representation-free (i.e.,
coordinate-free) algebra that contains complex numbers, vectors, the quaternion
subalgebra, and generalized cross products as directed areas (called
bivectors). This algebraic-geometric framework is tailored to nonrelativistic
quantum mechanics, where spinors acquire geometric aspects and Gauss's
and Stokes's theorems appear as components of a unified theorem.

The discussion of orthogonal matrices in Section 3.3 and unitary matrices
in this section is only a beginning. Further extensions are of vital concern in
modern particle physics. With the Pauli matrices, we can develop spinor wave
functions for electrons, protons, and other spin-1/2 particles.


17 For its geometrical significance, see Baylis, W. E., Huschilt, J., and Jiansu Wei (1992). Am. J.
Phys. 60, 788.


SUMMARY


Unitary matrices for a complex vector space play the role that orthogonal
matrices perform in a real vector space: they represent coordinate
transformations. Therefore, unitary matrices are composed of mutually orthogonal
unit vectors. Hermitian matrices define quadratic forms in complex spaces
and are complex analogs of real symmetric matrices. Both have numerous
applications in quantum mechanics.


Biographical Data 

Pauli, Wolfgang. Pauli, an Austrian American theoretical physicist, was
born in Vienna, Austria, in 1900 and died in Zurich, Switzerland, in 1958. He
was a prodigy, publishing in his teens a lucid review of relativity while still a 
student in Sommerfeld’s group in Munich, Germany, where he obtained his 
Ph.D. in 1921. In 1925, he found the exclusion principle named after him, 
for which he was eventually honored with the Nobel prize in 1945. He was a 
scathing critic of numerous physicists of his times (“conscience of physics”). 
With Heisenberg, and independently of Dirac, he founded quantum field 
theory and QED. In 1931, he proposed the neutrino (thus named by Fermi) 
to conserve energy and quantum numbers in the weak interaction in a letter 
to colleagues at a conference but never published this suggestion for fear of 
ridicule or making a mistake. The elusive neutrino was finally tracked down 
experimentally in 1956 and F. Reines received the Nobel prize in 1995. 


EXERCISES 

3.4.1 Show that

$$\det(A^*) = (\det A)^* = \det(A^{\dagger}).$$

3.4.2 Three angular momentum matrices satisfy the basic commutation
relation

$$[J_x, J_y] = iJ_z$$

(and cyclic permutation of indices). If two of the matrices have real
elements, show that the elements of the third must be pure imaginary.

3.4.3 Show that $(AB)^{\dagger} = B^{\dagger}A^{\dagger}$.

3.4.4 A matrix $C = S^{\dagger}S$. Show that the trace is positive definite unless S is
the null matrix, in which case trace(C) = 0.

3.4.5 If A and B are Hermitian matrices, show that (AB + BA) and i(AB - BA)
are also Hermitian.

3.4.6 The matrix C is not Hermitian. Show that $C + C^{\dagger}$ and $i(C - C^{\dagger})$ are
Hermitian. This means that a non-Hermitian matrix may be resolved
into two Hermitian parts:

$$C = \frac{1}{2}(C + C^{\dagger}) + \frac{1}{2i}\, i(C - C^{\dagger}).$$

This decomposition of a matrix into two Hermitian matrix parts parallels
the decomposition of a complex number z into x + iy, where
$x = (z + z^*)/2$ and $y = (z - z^*)/2i$.

3.4.7 A and B are two noncommuting Hermitian matrices:

$$AB - BA = iC.$$

Prove that C is Hermitian.

3.4.8 Show that a Hermitian matrix remains Hermitian under unitary similarity
transformations.

3.4.9 Two matrices A and B are each Hermitian. Find a necessary and sufficient
condition for their product AB to be Hermitian.

ANS. [A, B] = 0.

3.4.10 Show that the reciprocal (i.e., inverse) of a unitary matrix is unitary. 
Show that the product of two unitary matrices is unitary. 

Hint. These are key steps in proving that unitary matrices form a group. 

3.4.11 A particular similarity transformation yields

$$A' = UAU^{-1},$$
$$A^{\dagger\prime} = UA^{\dagger}U^{-1}.$$

If the adjoint relationship is preserved ($A^{\dagger\prime} = A^{\prime\dagger}$) and det U = 1, show
that U must be unitary.

3.4.12 Two matrices U and H are related by

$$U = e^{iaH},$$

with a real. (The exponential function is defined by a Maclaurin expansion.
This will be done in Section 5.11.)

(a) If H is Hermitian, show that U is unitary.

(b) If U is unitary, show that H is Hermitian. (H is independent of a.)
Note. With H the Hamiltonian,

$$\psi(x, t) = U(x, t)\psi(x, 0) = \exp(-itH/\hbar)\psi(x, 0)$$

is a solution of the time-dependent Schrödinger equation. $U(x, t) =
\exp(-itH/\hbar)$ is the evolution operator.

3.4.13 An operator $T(t + \varepsilon, t)$ describes the change in the wave function from
$t$ to $t + \varepsilon$. For $\varepsilon$ real and small enough so that $\varepsilon^2$ may be neglected,

$$T(t + \varepsilon, t) = 1 - \frac{i}{\hbar}\varepsilon H(t).$$

(a) If T is unitary, show that H is Hermitian.

(b) If H is Hermitian, show that T is unitary.

Note. When H(t) is independent of time, this relation may be put in
exponential form (Exercise 3.4.12).

3.4.14 (a) Given $\mathbf{r}' = U\mathbf{r}$, with U a unitary matrix and $\mathbf{r}$ a (column) vector with
complex elements, show that the norm (magnitude) of $\mathbf{r}$ is invariant
under this operation.

(b) The matrix U transforms any column vector $\mathbf{r}$ with complex elements
into $\mathbf{r}'$ leaving the magnitude invariant: $\mathbf{r}^{\dagger}\mathbf{r} = \mathbf{r}'^{\dagger}\mathbf{r}'$. Show
that U is unitary.


3.5 Diagonalization of Matrices 


Moment of Inertia Matrix


In Example 1.5.3, we analyzed a quadratic form G that can be defined by a
symmetric $2 \times 2$ matrix A as

$$x^2 + xy + y^2 = (x, y)\begin{pmatrix} 1 & 1/2 \\ 1/2 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} \equiv \langle x|A|x\rangle,$$

where $\langle x| = (x, y)$ is the two-dimensional (transpose) coordinate vector. We
found that G(x, y) = 1 is an ellipse with rotated axes. In the rotated coordinate
system along the major and minor axes $x \pm y$ of the ellipse, G becomes the
familiar sum of squares $3(x + y)^2 + (x - y)^2 = 4$. This situation suggests that
using a suitable rotation of coordinates, we can transform a general quadratic
form into a sum of squares, that is, diagonalize the associated (real) symmetric
matrix.
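As a numerical illustration of this rotation, the sketch below (plain Python, not from the original text) rotates the matrix of G by 45 degrees and checks the sum-of-squares identity at a sample point. The function names and the sample point are assumptions made for the illustration.

```python
import math

A = [[1.0, 0.5], [0.5, 1.0]]           # matrix of G = x^2 + x*y + y^2


def quad_form(a, v):
    """Evaluate <v|a|v> for a 2x2 matrix a and a 2-component vector v."""
    return sum(v[i] * a[i][j] * v[j] for i in range(2) for j in range(2))


def rotated_matrix(a, angle):
    """Return R a R^T, with R = [[c, s], [-s, c]] giving components along rotated axes."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[c, s], [-s, c]]
    ra = [[sum(r[i][k] * a[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(ra[i][k] * r[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Rotating the axes by 45 degrees aligns them with the directions x + y and x - y.
A_rot = rotated_matrix(A, math.pi / 4)

# For any point (x, y): 4 G = 3 (x + y)^2 + (x - y)^2.
x, y = 0.3, -1.1                        # arbitrary sample point
lhs = 4 * quad_form(A, [x, y])
rhs = 3 * (x + y) ** 2 + (x - y) ** 2
```

In the rotated frame the matrix is diagonal with entries 3/2 and 1/2, which is the same statement as $3(x + y)^2 + (x - y)^2 = 4$ once the new coordinates $(x \pm y)/\sqrt{2}$ are substituted.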

In many physical problems involving real symmetric or complex Hermitian
matrices it is desirable to carry out a (real) orthogonal similarity transformation
or a unitary transformation (corresponding to a rotation of the coordinate
system) to reduce the matrix to a diagonal form, with nondiagonal elements all
equal to zero. One particularly direct example is the moment of inertia matrix
$\mathsf{I}$ of a rigid body. From the definition of angular momentum $\mathbf{L}$ we have

$$\mathbf{L} = \mathsf{I}\boldsymbol{\omega}, \tag{3.127}$$

where $\boldsymbol{\omega}$ is the angular velocity.$^{18}$ The inertia matrix $\mathsf{I}$ is found to have diagonal
components

$$I_{11} = \sum_i m_i(r_i^2 - x_i^2), \quad \text{and so on,} \tag{3.128}$$

where the subscript i refers to mass $m_i$ located at $\mathbf{r}_i = (x_i, y_i, z_i)$. For the
nondiagonal components we have

$$I_{12} = -\sum_i m_i x_i y_i = I_{21}. \tag{3.129}$$

By inspection, matrix $\mathsf{I}$ is symmetric. Also, since $\mathsf{I}$ appears in a physical equation
of the form (3.127), which holds for all orientations of the coordinate system,
it may be considered to be a tensor (quotient rule; Section 2.8). Note that $\mathbf{L}$
and $\boldsymbol{\omega}$ are not necessarily parallel, except when $\boldsymbol{\omega}$ is along special directions
(the principal axes that we want to find).

The key now is to orient the coordinate axes (along a body-fixed frame) so
that $I_{12}$ and the other nondiagonal elements vanish. As a consequence
of this orientation and an indication of it, if the angular velocity is along one
such realigned principal axis, the angular velocity and the angular momentum
will be parallel. As an illustration, the stability of rotation is used by football
players when they throw the ball spinning about its long principal axis.

$^{18}$ The moment of inertia matrix may also be developed from the kinetic energy of a rotating body,
$T = \frac{1}{2}\langle\omega|\mathsf{I}|\omega\rangle$, defined as in Eq. (3.130) with the three-dimensional vector $|\omega\rangle$ replacing $\hat{\mathbf{n}}$.
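The statement that angular momentum and angular velocity are parallel only along principal axes can be tried on a small system of point masses. The sketch below is an illustration in plain Python, assuming the standard component form $I_{ab} = \sum_i m_i(r_i^2\delta_{ab} - x_{ia}x_{ib})$; the two-mass "dumbbell" and all function names are hypothetical choices, not taken from the text.

```python
def inertia_matrix(masses):
    """Inertia matrix from point masses given as (m, (x, y, z)) tuples."""
    inertia = [[0.0] * 3 for _ in range(3)]
    for m, r in masses:
        r2 = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                inertia[a][b] += m * (r2 * delta - r[a] * r[b])
    return inertia


def angular_momentum(inertia, w):
    """L = I w for angular velocity w, as in Eq. (3.127)."""
    return [sum(inertia[a][b] * w[b] for b in range(3)) for a in range(3)]


def cross(u, v):
    """Cross product; it vanishes exactly when u and v are parallel."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


# Hypothetical dumbbell: two unit masses on the line x = y in the xy-plane.
masses = [(1.0, (1.0, 1.0, 0.0)), (1.0, (-1.0, -1.0, 0.0))]
I_dumbbell = inertia_matrix(masses)

w_x = [1.0, 0.0, 0.0]                  # rotation about the x-axis (not a principal axis)
w_p = [1.0, -1.0, 0.0]                 # rotation about a principal axis
L_x = angular_momentum(I_dumbbell, w_x)
L_p = angular_momentum(I_dumbbell, w_p)
```

For this configuration `L_x` has a y-component although the rotation is about the x-axis, whereas `L_p` is a multiple of `w_p`.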


Eigenvectors and Eigenvalues


It is instructive to consider a geometrical picture of this principal axes problem.
If the inertia matrix $\mathsf{I}$ is multiplied from each side by a unit vector of variable
direction, $\hat{\mathbf{n}} = (n_1, n_2, n_3)$, then in the Dirac bracket notation of Section 3.2 as
for the quadratic form $G = \langle x|A|x\rangle$,

$$\langle n|\mathsf{I}|n\rangle = (n_1, n_2, n_3)
\begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}
\begin{pmatrix} n_1 \\ n_2 \\ n_3 \end{pmatrix} = I, \tag{3.130}$$

where $I$ is the moment of inertia about the direction $\hat{\mathbf{n}}$. Carrying out the
multiplications, we obtain

$$I = I_{11}n_1^2 + I_{22}n_2^2 + I_{33}n_3^2 + 2I_{12}n_1n_2 + 2I_{13}n_1n_3 + 2I_{23}n_2n_3, \tag{3.131}$$

a positive definite quadratic form that must be an ellipsoid (Fig. 3.9) because
the moment of inertia about any axis is a positive observable. From analytic
geometry and Example 1.5.3 we know that the coordinate axes can be rotated
to coincide with the axes of our ellipsoid. In many elementary cases, especially
when symmetry is present, these new axes, called the principal axes, can be
found by inspection. This ellipsoid is a three-dimensional generalization of
Example 1.5.3. Thus, we can find the axes by locating the local extrema of the
ellipsoid in terms of the variable components of $\hat{\mathbf{n}}$, subject to the constraint
$\hat{\mathbf{n}}^2 = 1$. To deal with the constraint, we introduce a Lagrange multiplier $\lambda$, just
as we did in Example 1.5.4. Differentiating $\langle n|\mathsf{I}|n\rangle - \lambda\langle n|n\rangle$,

$$\frac{\partial}{\partial n_j}\left(\langle n|\mathsf{I}|n\rangle - \lambda\langle n|n\rangle\right) = 2\sum_k I_{jk}n_k - 2\lambda n_j = 0, \quad j = 1, 2, 3, \tag{3.132}$$

yields the eigenvalue equation

$$\mathsf{I}|n\rangle = \lambda|n\rangle. \tag{3.133}$$

Figure 3.9 Moment of Inertia Ellipsoid


The same result can be found by purely geometric methods. We now proceed 
to develop the geometric method of finding the diagonal elements and the 
principal axes. 

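If $R$, with $R^{-1} = \tilde{R}$ (the transpose), is the real orthogonal matrix such that
$\mathbf{n}' = R\mathbf{n}$, or $|n'\rangle = R|n\rangle$ in Dirac notation, gives the coordinates along the
principal axes, then we obtain using $\langle n'|R = \langle n|$ in Eq. (3.130),

$$\langle n|\mathsf{I}|n\rangle = \langle n'|R\mathsf{I}\tilde{R}|n'\rangle = I'_1 n_1^{\prime 2} + I'_2 n_2^{\prime 2} + I'_3 n_3^{\prime 2}, \tag{3.134}$$

where $I'_i > 0$ are the principal moments of inertia. The inertia matrix $\mathsf{I}'$ in
Eq. (3.134) is diagonal in the new coordinates,

$$\mathsf{I}' = R\mathsf{I}\tilde{R} = \begin{pmatrix} I'_1 & 0 & 0 \\ 0 & I'_2 & 0 \\ 0 & 0 & I'_3 \end{pmatrix}. \tag{3.135}$$

If we rewrite Eq. (3.135) using $R^{-1} = \tilde{R}$ in the form

$$\tilde{R}\mathsf{I}' = \mathsf{I}\tilde{R}, \tag{3.136}$$

and take $\tilde{R} = (\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3)$ to consist of three column vectors, then Eq. (3.136)
splits into three eigenvalue equations

$$\mathsf{I}\mathbf{v}_i = I'_i\mathbf{v}_i, \quad i = 1, 2, 3, \tag{3.137}$$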

with eigenvalues $I'_i$ and eigenvectors $\mathbf{v}_i$. The names were introduced from
the early German literature on quantum mechanics. Because these equations
are linear and homogeneous (for fixed i), by Section 3.1 their determinants
have to vanish (for a nontrivial solution to exist):

$$\begin{vmatrix} I_{11} - I'_i & I_{12} & I_{13} \\ I_{12} & I_{22} - I'_i & I_{23} \\ I_{13} & I_{23} & I_{33} - I'_i \end{vmatrix} = 0. \tag{3.138}$$

Replacing the eigenvalue $I'_i$ by a variable $\lambda$ times the unit matrix 1, we may
rewrite Eq. (3.137) as

$$(\mathsf{I} - \lambda 1)|v\rangle = 0, \tag{3.139}$$

whose determinant

$$|\mathsf{I} - \lambda 1| = 0 \tag{3.140}$$

is a cubic polynomial in $\lambda$; its three roots, of course, are the $I'_i$. Substituting
one root at a time back into Eq. (3.137) [or Eq. (3.139)], we can find the
corresponding eigenvectors from Eq. (3.135).

Because of its applications in astronomical theories, Eq. (3.138) [or Eq.
(3.140)] is known as the secular equation.$^{19}$ This method is also used in
quantum mechanics to find the eigenvalues and eigenvectors of observables
such as the Hamiltonian or angular momentum. The same treatment applies
to any real symmetric matrix $\mathsf{I}$, except that its eigenvalues need not all be
positive, but they are real, of course. Because Eq. (3.137) is homogeneous,
eigenvectors can always be normalized to unity, remaining eigenvectors. Also,
the orthogonality condition in Eqs. (3.86)-(3.89) for R states that, in geometric
terms, the eigenvectors $\mathbf{v}_i$ are mutually orthogonal unit vectors. (It is possible
for two eigenvectors to have the same eigenvalue, which then are called degenerate
eigenvalues.) Indeed, they form the new coordinate system. The fact that
any two eigenvectors $\mathbf{v}_i$, $\mathbf{v}_j$ are orthogonal if $I'_i \neq I'_j$ follows from Eq. (3.137) in
conjunction with the symmetry of $\mathsf{I}$ by multiplying with $\mathbf{v}_i$ and $\mathbf{v}_j$, respectively,

$$\langle v_j|\mathsf{I}|v_i\rangle = I'_i\,\mathbf{v}_j \cdot \mathbf{v}_i = \langle v_i|\mathsf{I}|v_j\rangle = I'_j\,\mathbf{v}_i \cdot \mathbf{v}_j. \tag{3.141}$$

Since $I'_i \neq I'_j$ and Eq. (3.141) implies that $(I'_j - I'_i)\,\mathbf{v}_i \cdot \mathbf{v}_j = 0$, we have $\mathbf{v}_i \cdot \mathbf{v}_j = 0$.


Hermitian Matrices 


For complex vector spaces Hermitian and unitary matrices play the same role
as symmetric and orthogonal matrices for real vector spaces, respectively.
First, let us generalize the important theorem about the diagonal elements and
the principal axes for the eigenvalue equation

$$A|r\rangle = \lambda|r\rangle. \tag{3.142}$$

We now show that if A is a Hermitian matrix,$^{20}$ its eigenvalues are real and its
eigenvectors orthogonal.


$^{19}$ Equation (3.127) will take on this form when $\boldsymbol{\omega}$ is along one of the principal axes. Then $\mathbf{L} = \lambda\boldsymbol{\omega}$
and $\mathsf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega}$. In the mathematics literature $\lambda$ is usually called a characteristic value and $\boldsymbol{\omega}$ a
characteristic vector.

$^{20}$ If A is real, the Hermitian requirement is replaced by a requirement of symmetry.

Let $\lambda_i$ and $\lambda_j$ be two eigenvalues and $|r_i\rangle$ and $|r_j\rangle$ the corresponding
eigenvectors of A, a Hermitian matrix. Then

$$A|r_i\rangle = \lambda_i|r_i\rangle, \tag{3.143}$$

$$A|r_j\rangle = \lambda_j|r_j\rangle. \tag{3.144}$$

Equation (3.143) is multiplied by $\langle r_j|$:

$$\langle r_j|A|r_i\rangle = \lambda_i\langle r_j|r_i\rangle. \tag{3.145}$$

Equation (3.144) is multiplied by $\langle r_i|$ to give

$$\langle r_i|A|r_j\rangle = \lambda_j\langle r_i|r_j\rangle. \tag{3.146}$$

Taking the adjoint$^{21}$ of this equation, we have

$$\langle r_j|A^{\dagger}|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle \tag{3.147}$$

or

$$\langle r_j|A|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle \tag{3.148}$$

as A is Hermitian. Subtracting Eq. (3.148) from Eq. (3.145), we obtain

$$(\lambda_i - \lambda_j^*)\langle r_j|r_i\rangle = 0. \tag{3.149}$$

This is a general result for all possible combinations of i and j. First, let j = i.
Then Eq. (3.149) becomes

$$(\lambda_i - \lambda_i^*)\langle r_i|r_i\rangle = 0. \tag{3.150}$$

Because $\langle r_i|r_i\rangle = 0$ would be a trivial solution [$\langle r_i| = (0, 0, 0)$] of Eq. (3.143),
we conclude that

$$\lambda_i = \lambda_i^*, \tag{3.151}$$

or $\lambda_i$ is real, for all i.

Second, for $i \neq j$ and $\lambda_i \neq \lambda_j$,

$$(\lambda_i - \lambda_j)\langle r_j|r_i\rangle = 0, \tag{3.152}$$

or

$$\langle r_j|r_i\rangle = 0, \tag{3.153}$$

which means that the eigenvectors of distinct eigenvalues are orthogonal,
Eq. (3.153) being our generalization of orthogonality in this complex space.$^{22}$
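Both conclusions can be observed on a small case. The sketch below (plain Python, not from the original text) uses the closed-form roots of the $2 \times 2$ secular equation for a Hermitian matrix with real diagonal elements a, d and off-diagonal element b; the function names and the sample values are assumptions made for the illustration, and the shortcut used for the eigenvectors requires $b \neq 0$.

```python
import math


def eig_hermitian_2x2(a, b, d):
    """Eigenpairs of the Hermitian matrix [[a, b], [conj(b), d]] (a, d real, b != 0)."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)   # real, so both roots are real
    pairs = []
    for lam in (mean - radius, mean + radius):
        # The first row, (a - lam) x + b y = 0, is solved by (x, y) = (b, lam - a).
        v = [b, lam - a]
        norm = math.sqrt(sum(abs(c) ** 2 for c in v))
        pairs.append((lam, [c / norm for c in v]))
    return pairs


def inner(u, v):
    """Complex scalar product <u|v> = sum_i conj(u_i) v_i."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def apply(a, b, d, v):
    """Act with [[a, b], [conj(b), d]] on the vector v."""
    return [a * v[0] + b * v[1], b.conjugate() * v[0] + d * v[1]]


a, b, d = 2.0, 1 - 1j, 3.0             # illustrative Hermitian matrix
(lam1, r1), (lam2, r2) = eig_hermitian_2x2(a, b, d)
overlap = inner(r1, r2)                # <r1|r2>, expected to vanish
```

Note that the complex conjugate in `inner` is essential: without it the two eigenvectors of this example would not come out orthogonal.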


$^{21}$ Note $\langle r_j| = |r_j\rangle^{\dagger}$ for complex vectors, and $\langle r|(A - \lambda \cdot 1) = 0$ because $(A - \lambda \cdot 1)^{\dagger} = A - \lambda^* \cdot 1$.

$^{22}$ The corresponding theory for differential operators (Sturm-Liouville theory) appears in
Section 9.2.

If $\lambda_i = \lambda_j$ (degenerate case), $|r_i\rangle$ is not automatically orthogonal to $|r_j\rangle$, but
it may be made orthogonal.$^{23}$ Consider the physical problem of the moment
of inertia matrix again. If $x_1$ is an axis of rotational symmetry, then we will
find that $\lambda_2 = \lambda_3$. Eigenvectors $|r_2\rangle$ and $|r_3\rangle$ are each perpendicular to the
symmetry axis, $|r_1\rangle$, but they lie anywhere in the plane perpendicular to $|r_1\rangle$;
that is, any linear combination of $|r_2\rangle$ and $|r_3\rangle$ is also an eigenvector. Consider
$(a_2|r_2\rangle + a_3|r_3\rangle)$ with $a_2$ and $a_3$ constants. Then

$$A(a_2|r_2\rangle + a_3|r_3\rangle) = a_2\lambda_2|r_2\rangle + a_3\lambda_3|r_3\rangle = \lambda_2(a_2|r_2\rangle + a_3|r_3\rangle), \tag{3.154}$$

as is to be expected because $x_1$ is an axis of rotational symmetry. Therefore, if
$|r_1\rangle$ and $|r_2\rangle$ are fixed, $|r_3\rangle$ may simply be chosen to lie in the plane perpendicular
to $|r_1\rangle$ and also perpendicular to $|r_2\rangle$. A general method of orthogonalizing
solutions, the Gram-Schmidt process, is applied to functions in Section 9.3
but works for vectors similarly.
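For vectors, the Gram-Schmidt process amounts to subtracting projections and normalizing. The following is a minimal sketch in plain Python, not taken from the text; the symmetric matrix with a doubly degenerate eigenvalue and the two starting vectors are illustrative assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Orthonormalize linearly independent real vectors, in the order given."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            overlap = dot(e, w)
            w = [wi - overlap * ei for wi, ei in zip(w, e)]   # remove the component along e
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def matvec(a, v):
    return [dot(row, v) for row in a]


# Illustrative symmetric matrix whose eigenvalue 1 is doubly degenerate.
A = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
# Two independent but non-orthogonal eigenvectors for eigenvalue 1 (both have y = z).
u1, u2 = [1.0, 1.0, 1.0], [1.0, 2.0, 2.0]
e1, e2 = gram_schmidt([u1, u2])
```

Because any linear combination of degenerate eigenvectors is again an eigenvector, `e1` and `e2` still satisfy the eigenvalue equation after orthogonalization.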

The set of n orthogonal eigenvectors of our $n \times n$ Hermitian matrix forms
a complete set, spanning the n-dimensional (complex) space. Eigenvalues
and eigenvectors are not limited to Hermitian matrices. All matrices have at
least one eigenvalue and eigenvector. However, only Hermitian matrices have
a complete set of orthogonal eigenvectors and all eigenvalues real.



Anti-Hermitian Matrices 


Occasionally, in quantum theory we encounter anti-Hermitian matrices: 


$$A^{\dagger} = -A.$$


Following the analysis of the first portion of this section, we can show that 


(a) The eigenvalues are pure imaginary (or zero). 

(b) The eigenvectors corresponding to distinct eigenvalues are orthogonal. 


The matrix R formed from the normalized eigenvectors is unitary. This 
anti-Hermitian property is preserved under unitary transformations. 


$^{23}$ We are assuming here that the eigenvectors of the n-fold degenerate $\lambda_i$ span the corresponding
n-dimensional space. This may be shown by including a parameter $\varepsilon$ in the original matrix to remove
the degeneracy and then letting $\varepsilon$ approach zero. This is analogous to breaking a degeneracy in
atomic spectroscopy by applying an external magnetic field (Zeeman effect).


EXAMPLE 3.5.1


Eigenvalues and Eigenvectors of a Real Symmetric Matrix Find the
eigenvalues and eigenvectors of

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{3.155}$$

The secular equation is

$$\begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0, \tag{3.156}$$

or

$$-\lambda(\lambda^2 - 1) = 0, \tag{3.157}$$

expanding by minors. The roots are $\lambda = -1, 0, 1$. To find the eigenvector
corresponding to $\lambda = -1$, we substitute this value back into the eigenvalue
equation [Eq. (3.142)],

$$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{3.158}$$

With $\lambda = -1$, this yields

$$x + y = 0, \quad z = 0. \tag{3.159}$$

Within an arbitrary scale factor, and an arbitrary sign (or phase factor), $\langle r_1| =
(1, -1, 0)$. Note that (for real $|r\rangle$ in ordinary space) the eigenvector singles
out a line in space. The positive or negative sense is not determined. This
indeterminacy could be expected if we noted that Eq. (3.158) is homogeneous
in $|r\rangle$ so that $|r_1\rangle$ remains an eigenvector if multiplied by a nonzero constant.
For convenience, we will require that the eigenvectors be normalized to unity,
$\langle r_1|r_1\rangle = 1$. With this choice of sign,

$$\langle r_1| = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right) \tag{3.160}$$

is fixed. For $\lambda = 0$, Eq. (3.158) yields

$$y = 0, \quad x = 0. \tag{3.161}$$

$\langle r_2| = (0, 0, 1)$ is a suitable eigenvector. Finally, for $\lambda = 1$, we get

$$-x + y = 0, \quad z = 0, \tag{3.162}$$

or

$$\langle r_3| = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right). \tag{3.163}$$

The orthogonality of $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$, corresponding to three distinct eigenvalues,
may be easily verified. ■


EXAMPLE 3.5.2


Degenerate Eigenvalues Consider

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}. \tag{3.164}$$

The secular equation is

$$\begin{vmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0 \tag{3.165}$$

or

$$(1 - \lambda)(\lambda^2 - 1) = 0, \quad \lambda = -1, 1, 1, \tag{3.166}$$

a degenerate case. If $\lambda = -1$, the eigenvalue equation (3.142) yields

$$2x = 0, \quad y + z = 0. \tag{3.167}$$

A suitable normalized eigenvector is

$$\langle r_1| = \left(0, \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right). \tag{3.168}$$

For $\lambda = 1$, we get

$$-y + z = 0. \tag{3.169}$$

Any eigenvector satisfying Eq. (3.169) is perpendicular to $\mathbf{r}_1$. We have an infinite
number of choices. Suppose, as one possible choice, $\mathbf{r}_2$ is taken as

$$\langle r_2| = \left(0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right), \tag{3.170}$$

which clearly satisfies Eq. (3.169). Then $\mathbf{r}_3$ must be perpendicular to $\mathbf{r}_1$ and
may be made perpendicular to $\mathbf{r}_2$ by$^{24}$

$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0). \tag{3.171}$$

■
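The eigenpairs quoted in Examples 3.5.1 and 3.5.2 can be checked by substituting them into the secular determinant and the eigenvalue equation. The sketch below is an illustration in plain Python; the helper names (`det3`, `secular`, `residual`, `check`) are assumptions, and the matrices and vectors are those of Eqs. (3.155) and (3.164) as read above.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def secular(a, lam):
    """Value of the secular determinant |A - lam 1|."""
    return det3([[a[i][j] - (lam if i == j else 0) for j in range(3)]
                 for i in range(3)])


def residual(a, lam, r):
    """Largest component of A r - lam r (zero for an exact eigenpair)."""
    return max(abs(sum(a[i][j] * r[j] for j in range(3)) - lam * r[i])
               for i in range(3))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def check(a, pairs, tol=1e-12):
    """True if every pair solves both equations and the vectors are orthonormal."""
    ok = all(abs(secular(a, lam)) < tol and residual(a, lam, r) < tol
             for lam, r in pairs)
    for i, (_, ri) in enumerate(pairs):
        for j, (_, rj) in enumerate(pairs):
            ok = ok and abs(dot(ri, rj) - (1 if i == j else 0)) < tol
    return ok


s = 1 / math.sqrt(2)
A1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]                       # Example 3.5.1
pairs1 = [(-1, (s, -s, 0)), (0, (0, 0, 1)), (1, (s, s, 0))]
A2 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]                       # Example 3.5.2
pairs2 = [(-1, (0, s, -s)), (1, (0, s, s)), (1, (1, 0, 0))]

ok1 = check(A1, pairs1)
ok2 = check(A2, pairs2)
```

In the degenerate case the check passes for the particular choice of $\mathbf{r}_2$ and $\mathbf{r}_3$ made in the example, but any other orthonormal pair in the plane $y = z$ would pass as well.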



$^{24}$ The use of the cross product is limited to three-dimensional space (see Section 1.3).


Normal Modes of Vibration

We consider the vibrations of a classical model of the CO$_2$ molecule. It is an
illustration of the application of matrix techniques to a problem that does not
start as a matrix problem. It also provides an example of the eigenvalues and
eigenvectors of an asymmetric real matrix.


EXAMPLE 3.5.3


Normal Modes Consider three masses on the x-axis joined by springs as
shown in Fig. 3.10. The spring forces are assumed to be linear (small displacements;
Hooke's law) and the masses are constrained to stay on the x-axis.

Figure 3.10 Double Oscillator

Using a different coordinate for each mass, Newton's second law yields the
set of equations

$$\ddot{x}_1 = -\frac{k}{M}(x_1 - x_2),$$
$$\ddot{x}_2 = -\frac{k}{m}(x_2 - x_1) - \frac{k}{m}(x_2 - x_3), \tag{3.172}$$
$$\ddot{x}_3 = -\frac{k}{M}(x_3 - x_2).$$

The system of masses is vibrating. We seek the common frequencies, $\omega$, such
that all masses vibrate at this same frequency. These are the normal modes.
Let

$$x_i = x_{i0}e^{i\omega t}, \quad i = 1, 2, 3.$$

Substituting this set into Eq. (3.172), we may rewrite it as

$$\begin{pmatrix} \frac{k}{M} & -\frac{k}{M} & 0 \\ -\frac{k}{m} & \frac{2k}{m} & -\frac{k}{m} \\ 0 & -\frac{k}{M} & \frac{k}{M} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \omega^2\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \tag{3.173}$$

with the common factor $e^{i\omega t}$ divided out. We have an eigenvalue problem whose
matrix is not symmetric. Therefore, the eigenvectors need not be mutually
orthogonal. The secular equation is

$$\begin{vmatrix} \frac{k}{M} - \omega^2 & -\frac{k}{M} & 0 \\ -\frac{k}{m} & \frac{2k}{m} - \omega^2 & -\frac{k}{m} \\ 0 & -\frac{k}{M} & \frac{k}{M} - \omega^2 \end{vmatrix} = 0. \tag{3.174}$$

This leads to

$$\omega^2\left(\frac{k}{M} - \omega^2\right)\left(\omega^2 - \frac{2k}{m} - \frac{k}{M}\right) = 0.$$

The eigenvalues are

$$\omega^2 = 0, \quad \frac{k}{M}, \quad \frac{k}{M} + \frac{2k}{m},$$

all real.

The corresponding eigenvectors are determined by substituting the eigenvalues
back into Eq. (3.173), one at a time. For $\omega^2 = 0$, this yields

$$x_1 - x_2 = 0, \quad -x_1 + 2x_2 - x_3 = 0, \quad -x_2 + x_3 = 0.$$

Then, we get

$$x_1 = x_2 = x_3.$$

This describes pure translation with no relative motion of the masses and
no vibration.

For $\omega^2 = k/M$, Eq. (3.173) yields

$$x_1 = -x_3, \quad x_2 = 0.$$

The two outer masses are moving in opposite directions. The center mass is
stationary.

For $\omega^2 = k/M + 2k/m$ the eigenvector components are

$$x_1 = x_3, \quad x_2 = -\frac{2M}{m}x_1.$$

The two outer masses are moving together. The center mass is moving opposite
to the two outer ones. The net momentum is zero.

Any displacement of the three masses along the x-axis can be described
as a linear combination of these three types of motion: translation plus two
forms of vibration. ■
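The three modes of Example 3.5.3 can be substituted back into the matrix equation numerically. The sketch below is a plain-Python illustration, not part of the original text; the numerical values of k, M, and m (oxygen and carbon masses in atomic mass units, unit spring constant) and the function names are assumptions chosen only for the check.

```python
def mode_matrix(k, M, m):
    """Matrix of Eq. (3.173) for outer masses M, central mass m, spring constant k."""
    return [[k / M, -k / M, 0.0],
            [-k / m, 2 * k / m, -k / m],
            [0.0, -k / M, k / M]]


def normal_modes(k, M, m):
    """Eigenvalues omega^2 and (unnormalized) eigenvectors quoted in Example 3.5.3."""
    return [(0.0, (1.0, 1.0, 1.0)),                        # pure translation
            (k / M, (1.0, 0.0, -1.0)),                     # outer masses opposed
            (k / M + 2 * k / m, (1.0, -2 * M / m, 1.0))]   # center against outer masses


def residual(a, lam, x):
    """Largest component of a x - lam x (zero for an exact eigenpair)."""
    return max(abs(sum(a[i][j] * x[j] for j in range(3)) - lam * x[i])
               for i in range(3))


k, M, m = 1.0, 16.0, 12.0              # illustrative values only
A = mode_matrix(k, M, m)
modes = normal_modes(k, M, m)
residuals = [residual(A, w2, x) for w2, x in modes]

# Net momentum of the third mode is proportional to M*x1 + m*x2 + M*x3.
x_sym = modes[2][1]
net_momentum = M * x_sym[0] + m * x_sym[1] + M * x_sym[2]

# The matrix is not symmetric, and the first and third eigenvectors are not orthogonal.
overlap_13 = sum(p * q for p, q in zip(modes[0][1], modes[2][1]))
```

With these values `overlap_13` equals $2 - 2M/m$, which vanishes only in the special case $M = m$.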


Ill-Conditioned Systems 


A system of simultaneous linear equations may be written as

$$A|x\rangle = |y\rangle \quad \text{or} \quad A^{-1}|y\rangle = |x\rangle, \tag{3.175}$$

with A and $|y\rangle$ known and $|x\rangle$ unknown. The reader may encounter examples
in which a small error in $|y\rangle$ results in a larger error in $|x\rangle$. In this case, the
matrix A is called ill-conditioned. With $|\delta x\rangle$ an error in $|x\rangle$ and $|\delta y\rangle$ an error in
$|y\rangle$, the relative errors may be written as

$$\left[\frac{\langle\delta x|\delta x\rangle}{\langle x|x\rangle}\right]^{1/2} \leq K(A)\left[\frac{\langle\delta y|\delta y\rangle}{\langle y|y\rangle}\right]^{1/2}, \tag{3.176}$$

where K(A), a property of matrix A, is the condition number. For A Hermitian
one form of the condition number is given by$^{25}$

$$K(A) = \frac{|\lambda|_{\max}}{|\lambda|_{\min}}. \tag{3.177}$$

An approximate form due to Turing$^{26}$ is

$$K(A) = n[A_{ij}]_{\max}[A^{-1}_{ij}]_{\max}, \tag{3.178}$$

in which n is the order of the matrix and $[A_{ij}]_{\max}$ is the maximum element in A.


$^{25}$ Forsythe, G. E., and Moler, C. B. (1967). Computer Solution of Linear Algebraic Systems.
Prentice Hall, Englewood Cliffs, NJ.


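EXAMPLE 3.5.4


An Ill-Conditioned Matrix A common example of an ill-conditioned matrix
is the Hilbert matrix, $H_{ij} = (i + j - 1)^{-1}$. The Hilbert matrix of order 4,
$H_4$, is encountered in a least-squares fit of data to a third-degree polynomial.
We have

$$H_4 = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{pmatrix}. \tag{3.179}$$

The elements of the inverse matrix (order n) are given by

$$(H_n^{-1})_{ij} = \frac{(-1)^{i+j}}{i + j - 1} \cdot \frac{(n + i - 1)!\,(n + j - 1)!}{[(i - 1)!\,(j - 1)!]^2\,(n - i)!\,(n - j)!}. \tag{3.180}$$

For n = 4,

$$H_4^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix}. \tag{3.181}$$

From Eqs. (3.179), (3.181), the Turing estimate of the condition number for $H_4$
becomes

$$K_{\text{Turing}} = 4 \times 1 \times 6480 = 2.59 \times 10^4. \quad ■$$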
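The numbers in Example 3.5.4 can be reproduced with exact rational arithmetic. The sketch below uses Python's `fractions` module; it assumes the closed form for the inverse Hilbert matrix as read from Eq. (3.180) and takes the Turing estimate to be the order n times the largest element of A times the largest element of the inverse. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def hilbert(n):
    """Hilbert matrix H_ij = 1/(i + j - 1) with exact rational entries (i, j = 1..n)."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def hilbert_inverse(n):
    """Inverse Hilbert matrix from the closed form of Eq. (3.180)."""
    inv = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            num = (-1) ** (i + j) * factorial(n + i - 1) * factorial(n + j - 1)
            den = ((i + j - 1) * (factorial(i - 1) * factorial(j - 1)) ** 2
                   * factorial(n - i) * factorial(n - j))
            row.append(Fraction(num, den))
        inv.append(row)
    return inv


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def largest(m):
    """Largest absolute value among the elements of m."""
    return max(abs(x) for row in m for x in row)


def turing_estimate(a, a_inv):
    """Turing's approximate condition number: n * max|A_ij| * max|(A^-1)_ij|."""
    return len(a) * largest(a) * largest(a_inv)


H4 = hilbert(4)
H4_inv = hilbert_inverse(4)
product = matmul(H4, H4_inv)           # exact 4x4 unit matrix if the closed form is right
K_turing = turing_estimate(H4, H4_inv)
```

Working with `Fraction` avoids the very rounding errors that make the Hilbert matrix troublesome in floating-point arithmetic, so the product comes out as the unit matrix exactly.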


This is a warning that an input error may be multiplied by 26,000 in the
calculation of the output result. It is a statement that $H_4$ is ill-conditioned. If you
encounter a highly ill-conditioned system, you have two alternatives (besides
abandoning the problem):

(a) Try a different mathematical attack.

(b) Arrange to carry more significant figures and push through by brute force.

As previously seen, matrix eigenvector-eigenvalue techniques are not limited
to the solution of strictly matrix problems.


Functions of Matrices 


Polynomials with one or more matrix arguments are well defined and occur
often. Power series of a matrix may also be defined provided the series
converge (see Chapter 5) for each matrix element. For example, if A is any
$n \times n$ matrix, then the power series

$$\exp(A) = \sum_{i=0}^{\infty} \frac{1}{i!}A^i, \tag{3.182a}$$

$$\sin(A) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i + 1)!}A^{2i+1}, \tag{3.182b}$$

$$\cos(A) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i)!}A^{2i} \tag{3.182c}$$

are well-defined $n \times n$ matrices.

$^{26}$ Compare Todd, J., The Condition of the Finite Segments of the Hilbert Matrix, Applied
Mathematics Series No. 313. National Bureau of Standards, Washington, DC.

For the Pauli matrices $\sigma_k$ the Euler identity
for real $\theta$ and k = 1, 2, or 3,

$$\exp(i\sigma_k\theta) = 1_2\cos\theta + i\sigma_k\sin\theta, \tag{3.183}$$

follows from the Maclaurin series for the exponential by collecting all even and
odd powers of $\theta$ in an even and an odd power series using $\sigma_k^2 = 1$:

$$e^{i\sigma_k\theta} = \sum_{n=0}^{\infty} \frac{1}{n!}(i\sigma_k\theta)^n
= \sum_{m=0}^{\infty} \frac{i^{2m}}{(2m)!}(\sigma_k)^{2m}\theta^{2m} + \sum_{m=0}^{\infty} \frac{i^{2m+1}}{(2m + 1)!}(\sigma_k)^{2m+1}\theta^{2m+1}$$

$$= \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!}\theta^{2m} + i\sigma_k\sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)!}\theta^{2m+1}.$$


EXAMPLE 3.5.5


Exponential of a Diagonal Matrix If the matrix A is diagonal, such as $\sigma_3 =
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, then its nth power is also diagonal, with its diagonal matrix elements
raised to the nth power: $(\sigma_3)^n = \begin{pmatrix} 1 & 0 \\ 0 & (-1)^n \end{pmatrix}$. Then summing the exponential series
element for element yields

$$e^{\sigma_3} = \begin{pmatrix} \sum_{n=0}^{\infty} \frac{1}{n!} & 0 \\ 0 & \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} \end{pmatrix} = \begin{pmatrix} e & 0 \\ 0 & \frac{1}{e} \end{pmatrix}.$$

If we write the general diagonal matrix as $A = [a_1, a_2, \ldots, a_n]$ with diagonal
elements $a_j$, then $A^m = [a_1^m, a_2^m, \ldots, a_n^m]$, and summing the exponentials
elementwise again we obtain $e^A = [e^{a_1}, e^{a_2}, \ldots, e^{a_n}]$. ■
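Both the Euler identity (3.183) and the diagonal case of Example 3.5.5 can be tested by summing the exponential series directly. The sketch below truncates the series after a fixed number of terms, which is an assumption adequate only for matrices with small elements such as those used here; the function names, the angle, and the number of terms are illustrative choices.

```python
import math

ONE = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=40):
    """Partial sum of exp(a) = sum_n a^n / n!, the series of Eq. (3.182a)."""
    n = len(a)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]   # a^0 / 0!
    total = [row[:] for row in term]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in matmul(term, a)]       # a^order / order!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def max_diff(a, b):
    """Largest absolute difference between corresponding elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


theta = 0.7                                # arbitrary real angle
euler_errors = []
for sigma in SIGMA:
    lhs = mat_exp([[1j * theta * x for x in row] for row in sigma])
    rhs = [[math.cos(theta) * ONE[i][j] + 1j * math.sin(theta) * sigma[i][j]
            for j in range(2)] for i in range(2)]
    euler_errors.append(max_diff(lhs, rhs))

exp_sigma3 = mat_exp(SIGMA[2])             # compare with the diagonal matrix [e, 1/e]
```

For matrices with large elements a straightforward partial sum like this loses accuracy, so it should be read as a check of the identities rather than as a general-purpose routine.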

For a Hermitian matrix A there is a unitary matrix U that diagonalizes it; that
is, $UAU^{\dagger} = [a_1, a_2, \ldots, a_n]$. Then the trace formula

$$\det(\exp(A)) = \exp(\text{trace}(A)) \tag{3.184}$$

is obtained (see Exercises 3.5.2 and 3.5.9) from

$$\det(\exp(A)) = \det(U\exp(A)U^{\dagger}) = \det(\exp(UAU^{\dagger}))$$
$$= \det\exp[a_1, a_2, \ldots, a_n] = \det[e^{a_1}, e^{a_2}, \ldots, e^{a_n}]$$
$$= \prod_i e^{a_i} = \exp\left(\sum_i a_i\right) = \exp(\text{trace}(A)),$$

using $UA^iU^{\dagger} = (UAU^{\dagger})^i$ in the power series Eq. (3.182a) for $\exp(UAU^{\dagger})$ and the
product theorem for determinants in Section 3.2.

Another important relation is the Baker-Hausdorff formula

$$\exp(iG)H\exp(-iG) = H + [iG, H] + \frac{1}{2!}[iG, [iG, H]] + \cdots, \tag{3.185}$$

which follows from multiplying the power series for exp(iG) and collecting
the terms with the same powers of iG. Here we define

$$[G, H] = GH - HG$$

as the commutator of G and H.

The preceding analysis has the advantage of exhibiting and clarifying conceptual
relationships in the diagonalization of matrices. However, for matrices
larger than $3 \times 3$, or perhaps $4 \times 4$, the process rapidly becomes so cumbersome
that we turn gratefully to computers and iterative techniques.$^{27}$ One such
technique is the Jacobi method for determining eigenvalues and eigenvectors of
real symmetric matrices. This Jacobi technique and the Gauss-Seidel method
of solving systems of simultaneous linear equations are examples of relaxation
methods. They are iterative techniques in which the errors will, it is hoped,
decrease or relax as the iterations continue. Relaxation methods are used
extensively for the solution of partial differential equations.


SUMMARY


The diagonalization of a real symmetric matrix rotates the quadratic form
that it defines to its principal axes. A unitary matrix achieves the same for a
Hermitian matrix in a complex space. The eigenvectors define the principal
axes and the eigenvalues the principal moments. Eigenvalue problems occur
in classical mechanics in the search for the normal modes of oscillators and
in quantum mechanics as solutions of the Schrödinger equation.
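As a companion to the Jacobi method mentioned above, the following is a minimal sketch of a cyclic Jacobi iteration for a real symmetric matrix in plain Python. It is one common textbook variant, not necessarily the form intended by the authors: each plane rotation is chosen to zero one off-diagonal element, and sweeps are repeated until the off-diagonal norm falls below a tolerance. The function names, the tolerance, the sweep limit, and the test matrix are assumptions.

```python
import math


def jacobi_eigen(a, tol=1e-12, max_sweeps=50):
    """Eigenvalues and eigenvectors of a real symmetric matrix by cyclic Jacobi rotations.

    Returns (values, vectors) with vectors[k] the unit eigenvector for values[k].
    """
    n = len(a)
    a = [[float(x) for x in row] for row in a]                  # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle with tan(2 theta) = 2 a_pq / (a_qq - a_pp), |theta| <= pi/4.
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                              # columns: a <- a J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                              # rows: a <- J^T a
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):                              # accumulate v <- v J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    values = [a[i][i] for i in range(n)]
    vectors = [[v[k][i] for k in range(n)] for i in range(n)]   # columns of v
    return values, vectors


def residual(a, lam, x):
    """Largest component of a x - lam x."""
    n = len(a)
    return max(abs(sum(a[i][j] * x[j] for j in range(n)) - lam * x[i])
               for i in range(n))


A = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 1.0]]         # arbitrary symmetric test matrix
values, vectors = jacobi_eigen(A)
worst = max(residual(A, lam, x) for lam, x in zip(values, vectors))
```

The accumulated rotations play the role of the orthogonal matrix that brings the quadratic form to its principal axes, so `vectors` are the principal axes and `values` the principal moments. The sum of `values` should equal the trace of `A`, which is a convenient check.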

$^{27}$ In higher dimensional systems the secular equation may be strongly ill-conditioned with respect
to the determination of its roots (the eigenvalues). Direct solution by computer may be very
inaccurate. Iterative techniques for diagonalizing the original matrix are usually preferred. See
Sections 2.7 and 2.9 of Press et al., loc. cit.


EXERCISES

3.5.1 (a) Starting with the orbital angular momentum of the ith element of
mass,

$$\mathbf{L}_i = \mathbf{r}_i \times \mathbf{p}_i = m_i\mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i),$$

derive the inertia matrix such that $\mathbf{L} = \mathsf{I}\boldsymbol{\omega}$, $|L\rangle = \mathsf{I}|\omega\rangle$.

(b) Repeat the derivation starting with kinetic energy

$$T_i = \frac{1}{2}m_i(\boldsymbol{\omega} \times \mathbf{r}_i)^2 \quad \left(T = \frac{1}{2}\langle\omega|\mathsf{I}|\omega\rangle\right).$$

3.5.2 Show that the eigenvalues of a matrix are unaltered if the matrix is
transformed by a similarity transformation. This property is not limited
to symmetric or Hermitian matrices. It holds for any matrix satisfying
the eigenvalue equation [Eq. (3.139)]. If our matrix can be brought
into diagonal form by a similarity transformation, then two immediate
consequences are

(1) The trace (sum of eigenvalues) is invariant under a similarity
transformation.

(2) The determinant (product of eigenvalues) is invariant under a
similarity transformation.

Note. Prove this separately (for matrices that cannot be diagonalized).
The invariance of the trace and determinant are often demonstrated by
using the Cayley-Hamilton theorem: A matrix satisfies its own characteristic
(secular) equation.

3.5.3 As a converse of the theorem that Hermitian matrices have real
eigenvalues and that eigenvectors corresponding to distinct eigenvalues are
orthogonal, show that if

(a) the eigenvalues of a matrix are real and

(b) the eigenvectors satisfy $\mathbf{r}_i^{\dagger}\mathbf{r}_j = \delta_{ij} = \langle r_i|r_j\rangle$,

then the matrix is Hermitian.

3.5.4 Show that a real matrix that is not symmetric cannot be diagonalized
by an orthogonal similarity transformation.

Hint. Assume that the nonsymmetric real matrix can be diagonalized
and develop a contradiction.

3.5.5 The matrices representing the angular momentum components $J_x$, $J_y$,
and $J_z$ are all Hermitian. Show that the eigenvalues of $\mathbf{J}^2$, where $\mathbf{J}^2 =
J_x^2 + J_y^2 + J_z^2$, are real and nonnegative.

3.5.6 A has eigenvalues $\lambda_i$ and corresponding eigenvectors $|x_i\rangle$. Show that
$A^{-1}$ has the same eigenvectors but with eigenvalues $\lambda_i^{-1}$.

3.5.7 A square matrix with zero determinant is labeled singular.

(a) If A is singular, show that there is at least one nonzero column
vector $\mathbf{v}$ such that

$$A|v\rangle = 0.$$

(b) If there is a nonzero vector $|v\rangle$ such that

$$A|v\rangle = 0,$$

show that A is a singular matrix. This means that if a matrix (or
operator) has zero as an eigenvalue, the matrix (or operator) has
no inverse and its determinant is zero.

3.5.8 The same similarity transformation diagonalizes each of two matrices.
Show that the original matrices must commute. [This is particularly
important in the matrix (Heisenberg) formulation of quantum mechanics.]

3.5.9 Two Hermitian matrices A and B have the same eigenvalues. Show that
A and B are related by a unitary similarity transformation.

3.5.10 Show that the inertia matrix for a single particle of mass m at (x, y, z)
has a zero determinant. Explain this result in terms of the invariance
of the determinant of a matrix under similarity transformations
(Exercise 3.3.10) and a possible rotation of the coordinate system.

3.5.11 Unit masses are at the eight corners of a cube ($\pm 1, \pm 1, \pm 1$). Find the
moment of inertia matrix and show that there is a triple degeneracy.
This means that as far as moments of inertia are concerned, the cubic
structure exhibits spherical symmetry.


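3.5.12 Find the eigenvalues and corresponding orthonormal eigenvectors of
the following matrices (as a numerical check, note that the sum of
the eigenvalues equals the sum of the diagonal elements of the original
matrix; Exercise 3.3.9). Note also the correspondence between det A =
0 and the existence of $\lambda = 0$, as required by Exercises 3.5.2 and 3.5.7:

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}. \qquad \text{ANS. } \lambda = 0, 1, 2.$$

3.5.13

$$A = \begin{pmatrix} 5 & 0 & \sqrt{3} \\ 0 & 3 & 0 \\ \sqrt{3} & 0 & 3 \end{pmatrix}. \qquad \text{ANS. } \lambda = 2, 3, 6.$$

3.5.14

$$A = \begin{pmatrix} 1 & \sqrt{2} & 0 \\ \sqrt{2} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \qquad \text{ANS. } \lambda = -1, 0, 2.$$

3.5.15

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \qquad \text{ANS. } \lambda = 0, 0, 3.$$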


3.5.16 Write $x^2 + 2xy + 2yz + z^2$ as a sum of squares $x'^2 - y'^2 + 2z'^2$ in a rotated
coordinate system.

3.5.17 Describe the geometric properties of the surface

$$x^2 + 2xy + 2y^2 + 2xz + z^2 = 1.$$

How is it oriented in three-space? Is it a conic section? If so, which
kind?

3.5.18 Show that every 2 x 2 matrix has two eigenvectors and corresponding
eigenvalues. The eigenvectors are not necessarily orthogonal or
different. The eigenvalues are not necessarily real.

3.5.19 As an illustration of Exercise 3.5.18, find the eigenvalues and corresponding
eigenvectors for

$$\begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}.$$

Note that the eigenvectors are not orthogonal.

ANS. $\lambda_1 = 0$, $\langle r_1| = (2, -1)$;
$\lambda_2 = 4$, $\langle r_2| = (2, 1)$.


3.5.20 If A is a 2 x 2 matrix, show that its eigenvalues $\lambda$ satisfy the equation

$$\lambda^2 - \lambda\,\mathrm{trace}(A) + \det A = 0.$$

3.5.21 Assuming a unitary matrix U to satisfy an eigenvalue equation $U\mathbf{r} = \lambda\mathbf{r}$,
show that the eigenvalues of the unitary matrix have unit magnitude.
This same result holds for real orthogonal matrices.

3.5.22 Since an orthogonal matrix describing a rotation in real three-dimensional
space is a special case of a unitary matrix, such an orthogonal
matrix can be diagonalized by a unitary transformation.

(a) Show that the sum of the three eigenvalues is $1 + 2\cos\varphi$, where $\varphi$
is the net angle of rotation about a single fixed axis.

(b) Given that one eigenvalue is 1, show that the other two eigenvalues
must be $e^{i\varphi}$ and $e^{-i\varphi}$.

Our orthogonal rotation matrix (real elements) has complex eigenvalues.


3.5.23 A is an nth-order Hermitian matrix with orthonormal eigenvectors $|x_i\rangle$
and real eigenvalues $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le \lambda_n$. Show that for a unit
magnitude vector $|y\rangle$,

$$\lambda_1 \le \langle y|A|y\rangle \le \lambda_n.$$

3.5.24 A particular matrix is both Hermitian and unitary. Show that its eigenvalues
are all ±1.

Note. The Pauli matrices are specific examples. 

3.5.25 A has eigenvalues 1 and −1 and corresponding eigenvectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and
$\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Construct A.

3.5.26 A non-Hermitian matrix A has eigenvalues $\lambda_i$ and corresponding eigenvectors
$|u_i\rangle$. The adjoint matrix $A^\dagger$ has the same set of eigenvalues but
different corresponding eigenvectors, $|v_i\rangle$. Show that the eigenvectors
form a biorthogonal set in the sense that

$$\langle v_i|u_j\rangle = 0 \quad \text{for} \quad \lambda_i^* \neq \lambda_j.$$


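3.5.27 An n x n matrix A has n eigenvalues $\lambda_i$. If $B = e^A$, show that B has the
same eigenvectors as A, with the corresponding eigenvalues $B_i$ given
by $B_i = \exp(\lambda_i)$.

Note. $e^A$ is defined by the infinite expansion of the exponential:

$$e^A = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots.$$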


3.5.28 A matrix P is a projection operator satisfying the condition

$$P^2 = P.$$

Show that the corresponding eigenvalues $(\rho^2)_\lambda$ and $\rho_\lambda$ satisfy the
relation

$$(\rho^2)_\lambda = (\rho_\lambda)^2 = \rho_\lambda.$$

This means that the eigenvalues of P are 0 and 1.


3.5.29 In the matrix eigenvector-eigenvalue equation

$$A|r_i\rangle = \lambda_i|r_i\rangle,$$

A is an n x n Hermitian matrix. For simplicity, assume that its n real
eigenvalues are distinct, $\lambda_1$ being the largest. If $|r\rangle$ is an approximation
to $|r_1\rangle$,

$$|r\rangle = |r_1\rangle + \sum_{i=2}^{n} \delta_i |r_i\rangle,$$

show that

$$\frac{\langle r|A|r\rangle}{\langle r|r\rangle} \le \lambda_1$$

and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.

Hint. The n $|r_i\rangle$ form a complete orthogonal set spanning the n-dimensional
(complex) space.


3.5.30 Two equal masses are connected to each other and to walls by springs
as shown in Fig. 3.11. The masses are constrained to stay on a horizontal
line.

Figure 3.11 Coupled Harmonic Oscillators

(a) Set up the Newtonian acceleration equation for each mass.

(b) Solve the secular equation for the eigenvalues.

(c) Determine the eigenvectors and thus the normal modes of motion.





Group Theory 


Disciplined judgment, about what is neat 
and symmetrical and elegant, has time and 
time again proved an excellent guide to 
how nature works. 

—Murray Gell-Mann 


4.1 Introduction to Group Theory 


In classical mechanics the symmetry of a physical system leads to conservation
laws. Conservation of angular momentum (L) is a direct consequence of
rotational symmetry, which means invariance of some physical observables
(such as $L^2$) or geometrical quantities (such as length of a vector or distance
between points) under spatial rotations. In the first third of the 20th century, Wigner
and others realized that invariance was a key concept in understanding the
new quantum phenomena and in developing appropriate theories. For example,
Noether’s theorem establishes a conserved current from an invariance
of the Lagrangian of a field theory. Thus, in quantum mechanics the concept
of angular momentum and spin has become even more central. Its generalizations,
isospin in nuclear physics and the flavor symmetry in particle physics,
are indispensable tools in building and solving theories. Generalizations of the
concept of gauge invariance of classical electrodynamics to the isospin symmetry
lead to the electroweak gauge theory.

In each case the set of these symmetry operations forms a group, a mathematical
concept we shall soon define. Group theory is the mathematical tool
to treat invariants and symmetries. It brings unification and formalization of
principles such as spatial reflections, or parity, angular momentum, and geometry
that are widely used by physicists.

In geometry the fundamental role of group theory was recognized more 
than a century ago by mathematicians (e.g., Felix Klein’s Erlanger Programm). 
In Euclidean geometry the distance between two points, the scalar product of 
two vectors or metric, does not change under rotations or translations. These 
symmetries are characteristic of this geometry. In special relativity the metric,
or scalar product of four-vectors, differs from that of Euclidean geometry in
that it is no longer positive definite and is invariant under Lorentz transformations.

For a crystal, the symmetry group contains only a finite number of rotations 
at discrete values of angles or reflections. The theory of such discrete or finite 
groups, developed originally as a branch of pure mathematics, is now a useful 
tool for the development of crystallography and condensed matter physics. 
When the rotations depend on continuously varying angles (e.g., the Euler 
angles of Section 3.3) the rotation groups have an infinite number of elements. 
Such continuous (or Lie) 1 groups are the topic of this chapter. 


A group G may be defined as a set of objects or, in physics usually, symmetry 
operations (such as rotations or Lorentz transformations), called the elements 
g of G, that may be combined or “multiplied” to form a well-defined product 
in G that satisfies the following conditions: 

1. If a and b are any two elements of G, then the product ab is also an element
of G. In terms of symmetry operations, b is applied to the physical system
before a in the product, and the product ab is equivalent to a single symmetry
operation in G. Multiplication associates (or maps) an element ab of G with
the pair (a, b) of elements of G; this property is known as closure under
multiplication.

2. This multiplication is associative, (ab)c = a(bc).

3. There is a unit or identity element$^2$ 1 in G such that 1a = a1 = a for every
element a in G.

4. G must contain an inverse or reciprocal of every element a of G, labeled
$a^{-1}$, such that $aa^{-1} = a^{-1}a = 1$.

Note that the unit is unique, as is the inverse. The inverse of 1 is 1 because
1a = a1 = a for a = 1 yields 1 · 1 = 1. If a second unit 1' existed we would
have 11' = 1'1 = 1' and 1'1 = 11' = 1. Comparing, we see that 1' = 1. Similarly,
if a second inverse $a'^{-1}$ existed we would have $a^{-1}a = aa^{-1} = 1 = aa'^{-1}$.
Multiplying by $a^{-1}$ from the left, we get $a^{-1} = a'^{-1}$.
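
For a finite set the four group postulates can be checked by brute force. The following is a minimal sketch, not part of the text: the helper name `is_group` is hypothetical, and the test sets (the four complex numbers ±1, ±i under ordinary multiplication, and a two-element set that fails closure) are chosen here only for illustration.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the four group postulates on a finite set."""
    # 1. Closure: every product must again be an element of the set.
    closed = all(op(a, b) in elements for a, b in product(elements, repeat=2))
    # 2. Associativity: (ab)c == a(bc) for every triple.
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    # 3. Identity: collect every u with ua == au == a for all a.
    units = [u for u in elements
             if all(op(u, a) == a and op(a, u) == a for a in elements)]
    # 4. Inverse: every a needs some b with ab == ba == unit.
    has_inverses = bool(units) and all(
        any(op(a, b) == units[0] and op(b, a) == units[0] for b in elements)
        for a in elements
    )
    # The unit must exist and be unique, as argued in the text.
    return closed and associative and len(units) == 1 and has_inverses


def multiply(a, b):
    return a * b


fourth_roots = [1, -1, 1j, -1j]
print(is_group(fourth_roots, multiply))  # the set {1, -1, i, -i}
print(is_group([1, 1j], multiply))       # i*i = -1 lies outside this set
```

The first call should report a group and the second should not, because the product i · i = −1 is not in the set {1, i}.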


Definition of Group 


1 After the Norwegian mathematician Sophus Lie. 

2 Following E. Wigner, the unit element of a group is often labeled E, from the German Einheit 
(i.e., unit) or just 1 or I for identity. 
EXAMPLE 4.1.1

Coordinate Rotation An example of a group is the set of counterclockwise
coordinate rotations

$$\mathbf{x}' = \begin{pmatrix} x' \\ y' \end{pmatrix} = R(\varphi) \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{4.1}$$

through an angle $\varphi$ of the xy-coordinate system to a new orientation (see
Fig. 2.20). The product of two rotations is defined by rotating first by the angle
$\varphi_2$ and then by $\varphi_1$. According to Eqs. (3.36) and (3.37), the product of the
orthogonal 2 x 2 matrices, $R(\varphi_1)R(\varphi_2)$, describes the product of two rotations

$$\begin{pmatrix} \cos\varphi_1 & \sin\varphi_1 \\ -\sin\varphi_1 & \cos\varphi_1 \end{pmatrix} \begin{pmatrix} \cos\varphi_2 & \sin\varphi_2 \\ -\sin\varphi_2 & \cos\varphi_2 \end{pmatrix} = \begin{pmatrix} \cos(\varphi_1 + \varphi_2) & \sin(\varphi_1 + \varphi_2) \\ -\sin(\varphi_1 + \varphi_2) & \cos(\varphi_1 + \varphi_2) \end{pmatrix}, \tag{4.2}$$

using the addition formulas for the trigonometric functions. The product is
clearly a rotation represented by the orthogonal matrix with angle $\varphi_1 + \varphi_2$. The
product is the associative matrix multiplication. It is commutative or Abelian
because the order in which these rotations are performed does not matter. The
inverse of the rotation with angle $\varphi$ is that with angle $-\varphi$. The unit corresponds
to the angle $\varphi = 0$. The group’s name is SO(2), which stands for special
orthogonal rotations in two dimensions, where special means the 2 x 2
rotation matrices have determinant +1, and the angle $\varphi$ varies continuously
from 0 to $2\pi$, so that the group has infinitely many elements. The angle is the
group parameter. ■
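
The properties claimed for SO(2) in this example can be spot-checked numerically. The sketch below assumes the matrix form of Eq. (4.1); the helper names `rot2`, `matmul`, and `close` and the sample angles are illustrative choices, not taken from the text.

```python
import math


def rot2(phi):
    """2 x 2 coordinate-rotation matrix in the form of Eq. (4.1)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]


def matmul(a, b):
    """Matrix product for matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def close(a, b, tol=1e-12):
    """Element-wise comparison within a small tolerance."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


phi1, phi2 = 0.7, 1.9
product12 = matmul(rot2(phi1), rot2(phi2))

# Eq. (4.2): the angles add under matrix multiplication.
print(close(product12, rot2(phi1 + phi2)))
# Abelian: the order of the two rotations does not matter.
print(close(product12, matmul(rot2(phi2), rot2(phi1))))
# Inverse: the rotation by -phi undoes the rotation by phi.
print(close(matmul(rot2(phi1), rot2(-phi1)), rot2(0.0)))
```

All three comparisons should print `True` up to floating-point rounding.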


A subgroup G' of a group G is a group consisting of elements of G so that the
product of any of its elements is again in the subgroup G'; that is, G' is closed
under the multiplication of G. For example, the unit 1 of G always forms a
subgroup of G, and the unity with angle $\varphi = 0$ and the rotation with $\varphi = \pi$
about some axis form a finite subgroup of the group of rotations about that
axis.

If $gg'g^{-1}$ is an element of G' for any g of G and g' of G', then G' is called
an invariant subgroup of G. If the group elements are matrices, then the
element $gg'g^{-1}$ corresponds to a similarity transformation [see Eq. (3.108)] of
g' in G' by an element g of G (discussed in Chapter 3). Of course, the unit 1
of G always forms an invariant subgroup of G because $g1g^{-1} = 1$. When an
element g of G lies outside the subgroup G', then $gg'g^{-1}$ may also lie outside
G'. Let us illustrate this by three-dimensional rotations.


EXAMPLE 4.1.2 


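Similarity Transformation Rotations of the coordinates through a finite
angle $\varphi$ counterclockwise about the z-axis in three-dimensional space are
described as

$$\begin{pmatrix} x' \\ y' \\ z \end{pmatrix} = R_z(\varphi) \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{4.3}$$

which form a group by a generalization of Eq. (4.2) to our special 3 x 3 matrices
that keep their special form on multiplication. Moreover, the order of the
rotations in a product does not matter, just like in Eq. (4.2), so that the group
is Abelian. A general rotation about the x-axis is given by the matrix

$$R_x(\varphi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi \\ 0 & -\sin\varphi & \cos\varphi \end{pmatrix}.$$

Now consider a rotation $R_x$ by 90° about the x-axis. Its matrix is

$$R_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix},$$

and its inverse is

$$R_x^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix},$$

corresponding to the angle −90°. This can be checked by multiplying them:
$R_x R_x^{-1} = 1$. Then

$$R_x R_z(\varphi) R_x^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} \cos\varphi & 0 & -\sin\varphi \\ 0 & 1 & 0 \\ \sin\varphi & 0 & \cos\varphi \end{pmatrix},$$

which has the form of $R_y(\varphi)$ in Eq. (4.19), that is, a rotation by $\varphi$ about the y-axis
and no longer a rotation about the
z-axis, so that this element lies outside the subgroup of rotations about the
z-axis, and this subgroup is not an invariant subgroup. The set of these elements
for all $\varphi$ forms a group called conjugate to the group of rotations about the
z-axis. ■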
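
The similarity transformation in this example can be verified numerically. This is a sketch under the matrix conventions of Eq. (4.3) and of the x-axis rotation matrix given in the example; the function names and the sample angle are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(phi):
    """Coordinate rotation about the z-axis, Eq. (4.3)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(phi):
    """Coordinate rotation about the x-axis, same sign convention."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


phi = 0.4
r_x = rot_x(math.pi / 2)        # rotation by 90 degrees about x
r_x_inv = rot_x(-math.pi / 2)   # its inverse, rotation by -90 degrees

conjugate = matmul(matmul(r_x, rot_z(phi)), r_x_inv)

c, s = math.cos(phi), math.sin(phi)
expected = [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

print(close(matmul(r_x, r_x_inv), rot_z(0.0)))  # R_x R_x^{-1} = 1
print(close(conjugate, expected))               # result mixes x and z, not x and y
# A rotation about z always has 1 in the lower right corner; the conjugate does not.
print(abs(conjugate[2][2] - 1.0) > 1e-6)
```

The last check makes the point of the example concrete: the conjugated element is not of the form of Eq. (4.3), so it lies outside the subgroup of rotations about the z-axis.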


Orthogonal n x n matrices form the group O(n), and they form SO(n) if their
determinants are +1 (S stands for “special” and O for “orthogonal”), with
elements denoted by $O_i$. Because $\tilde{O}_i = O_i^{-1}$ (see Section 3.3 for orthogonal
3 x 3 matrices that preserve the lengths of vectors and distances between
points in three-dimensional Euclidean space), we see that the product

$$\widetilde{O_1 O_2} = \tilde{O}_2 \tilde{O}_1 = O_2^{-1} O_1^{-1} = (O_1 O_2)^{-1}$$

is also an orthogonal matrix in O(n) or SO(n). The inverse is the transpose
(orthogonal) matrix. The unit of the group is $1_n$. A real orthogonal n x n matrix
has n(n − 1)/2 independent parameters. For n = 2, there is only one parameter:
one angle in Eq. (4.1). For n = 3, there are three independent parameters:
for example, the three Euler angles of Section 3.3, and SO(3) is related to
rotations in three dimensions, just as SO(2) is in two dimensions. Because
O(n) contains orthogonal transformations with determinant −1, this group
includes reflections of coordinates or parity inversions. Likewise, unitary n x n
matrices form the group U(n), and they form SU(n) if their determinants are
+1. Because $U_i^\dagger = U_i^{-1}$ (see Section 3.4 for unitary matrices, which preserve
the norm of vectors with complex components and distances between points
in n-dimensional complex space), we see that

$$(U_1 U_2)^\dagger = U_2^\dagger U_1^\dagger = U_2^{-1} U_1^{-1} = (U_1 U_2)^{-1},$$

so that the product is unitary and an element of U(n) or SU(n). Each unitary
matrix obviously has an inverse, which again is unitary. Orthogonal matrices
are unitary matrices that are real so that SO(n) forms a subgroup of SU(n), as
does O(n) of U(n).


EXAMPLE 4.1.3 


Simple Unitary Groups The phase factors $e^{i\theta}$, with real angle $\theta$, of quantum
mechanical wave functions form a group under multiplication of complex
numbers because the phase angles add on multiplying $e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}$. Moreover,
$(e^{i\theta})^\dagger = e^{-i\theta}$ and $e^{i\theta} e^{-i\theta} = 1$ show unitarity and inverse, and $\theta = 0$ gives
the unit. This group (of unitary 1 x 1 matrices) has one continuous real parameter
and is therefore called U(1). The two elements ±1 form a finite (unitary)
subgroup, and the four elements ±1, ±i form another subgroup.

A finite unitary group of 2 x 2 matrices is defined by the two-dimensional
unit matrix and one of the three Pauli matrices, $\sigma_i$, using matrix multiplication.
Because $\sigma_i^2 = 1_2$, the inverse $\sigma_i^{-1} = \sigma_i$ and $1_2^{-1} = 1_2$. ■


When a potential has spherical symmetry we choose polar coordinates, and
the associated group of transformations is a rotation group. For problems with
spin (or other internal properties such as isospin or flavor), unitary groups play
a similar role. Therefore, in the following we discuss only the rotation groups
SO(n) and the unitary group SU(2) among the classical Lie groups.


Biographical Data 

Lie, Sophus. Lie, who was born in 1842 in Nordfjordeid, Norway, and died
in 1899 in Kristiania (now Oslo), started his analysis of continuous groups of
transformations in Paris and continued it throughout his life.

Wigner, Eugene Paul. Wigner, who was born in 1902 in Budapest, Hungary,
and died in 1995 in Princeton, New Jersey, studied in Berlin, moved to the
United States in the 1930s, and received the Nobel prize in 1963 for his contributions
to nuclear theory and applications of fundamental principles of
symmetry, such as the charge independence of nuclear forces. He developed
the unitary representations of the Lorentz group.

Homomorphism and Isomorphism


There may be a correspondence between the elements of two groups: one-to-one,
two-to-one, or many-to-one. If this correspondence preserves the group
multiplication, the two groups are homomorphic. If the correspondence is
one-to-one, still preserving the group multiplication,$^3$ then the groups are
isomorphic. An example is the rotations of the coordinates through an angle
$\varphi$ counterclockwise about the z-axis in three-dimensional space described
by Eq. (4.3). If we identify every rotation with its orthogonal 2 x 2 submatrix,
we have a one-to-one map that preserves the group multiplication according
to Eq. (4.2), an isomorphism or faithful representation. Of course,
we can also identify each rotation with its 3 x 3 matrix in Eq. (4.3), which
is also an isomorphism. Therefore, matrix representations are by no means
unique, but each isomorphism is a faithful map that preserves the group
multiplication.


Matrix Representations: Reducible and Irreducible 


The isomorphisms between rotations and groups of matrices just discussed are
examples of matrix representations of a group (of rotations). Such representations
of group elements by matrices are a very powerful technique and have
been almost universally adopted by physicists. The use of matrices imposes
no significant restriction. It can be shown that the elements of any finite group
and of the continuous groups of Sections 4.2-4.4 may be represented by matrices.
In each case, matrix multiplication follows from group multiplication.
The same group can be represented by matrices of different rank. Examples
are the rotations about the z-axis described in Eqs. (4.1) and (4.3).

To illustrate how matrix representations arise from a symmetry, consider
the time-independent Schrödinger equation (or some other eigenvalue equation,
such as $\mathsf{I}\mathbf{v}_i = I_i \mathbf{v}_i$ for the principal moments of inertia of a rigid body in
classical mechanics, for example)

$$H\psi = E\psi. \tag{4.4}$$


Let us assume that Eq. (4.4) stays invariant under a group G of transformations
R in G. For example, for a spherically symmetric Hamiltonian H the group G
would be SO(3). Consequently, H is the same in a rotated coordinate system,
where the Hamiltonian is given by the similarity transformation $RHR^{-1}$
according to Eq. (3.108) of Chapter 3. Hence,

$$RHR^{-1} = H, \quad \text{or} \quad RH = HR, \tag{4.5}$$

that is, “rotations” from G and H commute. Now take a solution $\psi$ of Eq. (4.4)
and “rotate” it with R, an element from G: $\psi \to R\psi$. Then $R\psi$ has the same
value of energy E because multiplying Eq. (4.4) by R and using Eq. (4.5)
yields

$$RH\psi = E(R\psi) = H(R\psi). \tag{4.6}$$

3 Suppose the elements of one group are labeled $g_i$, and the elements of a second group are labeled
$h_i$. Then $g_i \leftrightarrow h_i$ is a one-to-one correspondence for all values of i. If $g_i g_j = g_k$ and $h_i h_j = h_l$,
then $g_k$ and $h_l$ must be the corresponding group elements.

In other words, all rotated solutions $R\psi$ are degenerate in energy or form a
vector space that physicists call a multiplet. For example, the spin-up and
-down states of an electron form a doublet, and the states with projection
quantum numbers $m = -l, -l + 1, \ldots, 0, 1, \ldots, l$ of orbital angular momentum
l form a multiplet with 2l + 1 basis states. (The magnetic field in the Zeeman
effect lifts the degeneracy of these states and breaks the rotational symmetry
because the magnetic field points in some direction.)

Let us now assume that this vector space $V_\psi$ of transformed solutions has
a finite dimension n. Let $\psi_1, \psi_2, \ldots, \psi_n$ be a basis. Since $R\psi_j$ is a member of
the multiplet, we can expand it in terms of its basis,

$$R\psi_j = \sum_k r_{jk} \psi_k. \tag{4.7}$$


Thus, with each transformation R in G we can associate a matrix $r = (r_{jk})$,
and this map $R \to (r)$ is defined as a representation of G. If we can take any
element of $V_\psi$ and by rotating with all elements R of G transform it into all
other elements of $V_\psi$, then the representation is irreducible. The spin-up and
-down states of the hydrogen ground state form an irreducible representation
of SU(2) (they can be rotated into each other), but the 2s state and 2p states of
principal quantum number n = 2 of the hydrogen atom have the same energy
(i.e., they are degenerate) and form a reducible representation because the 2s
state cannot be rotated into the 2p states and vice versa (angular momentum
is conserved under rotations). With this case in mind, we see that if not all
elements of $V_\psi$ are reached, then $V_\psi$ splits into a direct sum of two or more
vector subspaces (see Chapter 3), $V_\psi = V_1 \oplus V_2 \oplus \cdots$, which are mapped into
themselves by rotating their elements (e.g., 2s → 2s, 2p → 2p). The direct sum
of two vector spaces is spanned by the basis vectors of both vector spaces. In
this case, the representation is called reducible. Then we can find a unitary
matrix U so that

$$U (r_{jk}) U^\dagger = \begin{pmatrix} r_1 & 0 & \cdots \\ 0 & r_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \tag{4.8}$$

for all R of G and all matrices $(r_{jk})$. Here, $r_1, r_2, \ldots$ are matrices of lower
dimension than $(r_{jk})$ that are lined up along the diagonal and the 0 are matrices
made up of zeros, that is, r is block-diagonal.

For example, for the 2s states of hydrogen, $r_1$ would be a unitary 2 x 2
matrix

$$r_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

If the electron spin is ignored, $r_1 = 1$ of the 2s state would be a one-dimensional
unit matrix. We may say that the representation r has been decomposed into
$r = r_1 + r_2 + \cdots$ along with $V_\psi = V_1 \oplus V_2 \oplus \cdots$, where the smaller $r_i$ are irreducible.
In any representation, each irreducible representation may be repeated a finite
number of times, and some do not appear at all.

The irreducible representations play a role in group theory that is analogous
to that of the unit vectors of vector analysis. They are the simplest
representations; all others can be built from them.

SUMMARY

Often, groups occur in physics as sets of transformations of coordinates or
symmetry transformations that leave a partial or ordinary differential equation
unchanged, or group elements define changes of bases of some vector space.
Therefore, matrix representations of groups are common in physics. The most
important application in quantum mechanics is based on the fact that the
degenerate states of multiplets in a spectrum reflect the symmetry group of
the Hamiltonian. Multiplets correspond to representations of the symmetry
group of a Hamiltonian.


EXERCISES 

4.1.1 Show that an n x n orthogonal matrix has n(n − 1)/2 independent parameters.

Hint. The orthogonality condition, Eq. (3.79), provides constraints.

4.1.2 The special linear group SL(2,C) consists of all 2 x 2 matrices (with
complex elements) having a determinant of +1. Show that such matrices
form a group.

Note. The SL(2,C) group can be related to the full Lorentz group in
Section 4.4, much as the SU(2) group is related to SO(3).

4.1.3 Show that rotations about the z-axis form a subgroup of SO(3). Is it an
invariant subgroup?

4.1.4 Show that if R, S, T are elements of a group G so that RS = T, and
$R \to (r_{ik})$, $S \to (s_{ik})$ is a representation according to Eq. (4.7), then

$$(r_{ik})(s_{ik}) = \Bigl( t_{ik} = \sum_n r_{in} s_{nk} \Bigr),$$

that is, group multiplication translates into matrix multiplication for any
group representation.

4.1.5 A subgroup H of G has elements $h_i$. Let x be a fixed element of the
original group G and not a member of H. The transform

$$x h_i x^{-1}, \quad i = 1, 2, \ldots$$

generates a conjugate subgroup $xHx^{-1}$. Show that this conjugate subgroup
satisfies each of the four group postulates and therefore is a group.

A characteristic of continuous groups known as Lie groups is that the elements
are functions of parameters having derivatives of arbitrary orders such as $\cos\varphi$
and $\sin\varphi$ in Eq. (4.1). This unlimited differentiability of the functions allows us
to develop the concept of generator and reduce the study of the whole group
to a study of the group elements in the neighborhood of the identity element.

Lie’s essential idea was to study elements R in a group G that are infinitesimally
close to the unity of G. Let us consider the SO(2) group as a simple
example. The 2 x 2 rotation matrices in Eq. (4.1) can be written in exponential
form using the Euler identity [Eq. (3.183)] as

$$R(\varphi) = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} = 1_2 \cos\varphi + i\sigma_2 \sin\varphi = \exp(i\sigma_2 \varphi). \tag{4.9}$$

From the exponential form it is obvious that multiplication of these matrices
is equivalent to addition of the arguments

$$R(\varphi_2) R(\varphi_1) = \exp(i\sigma_2 \varphi_2) \exp(i\sigma_2 \varphi_1) = \exp(i\sigma_2 (\varphi_1 + \varphi_2)) = R(\varphi_1 + \varphi_2).$$

Of course, the rotations close to 1 have small angle $\varphi \approx 0$.
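
The exponential form in Eq. (4.9) can be checked by summing the power series of the matrix exponential directly. The sketch below is illustrative only: `mat_exp` is a hypothetical helper that truncates the series after a fixed number of terms, and the sample angle is arbitrary.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Truncated power series exp(M) = 1 + M + M^2/2! + M^3/3! + ..."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds M^k / k!, built from the previous term
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


sigma_2 = [[0, -1j], [1j, 0]]   # second Pauli matrix
phi = 0.83

exponent = [[1j * phi * x for x in row] for row in sigma_2]
r_series = mat_exp(exponent)

r_direct = [[math.cos(phi), math.sin(phi)],
            [-math.sin(phi), math.cos(phi)]]

series_error = max(abs(r_series[i][j] - r_direct[i][j])
                   for i in range(2) for j in range(2))
print(series_error)
```

The printed deviation between the series and the matrix of Eq. (4.1) should be at the level of floating-point rounding.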

This suggests that we look for an exponential representation

$$R = \exp(i\varepsilon S), \quad \varepsilon \to 0, \tag{4.10}$$

for group elements R in G close to the unity 1. The operators S of the infinitesimal
transformations $i\varepsilon S$ are called generators of G. Therefore, $\sigma_2$ in Eq. (4.9)
is the generator of rotations about the z-axis. Thus, for SO(2) as defined by
Eq. (4.1) there is only one linearly independent generator, $\sigma_2$. In SO(3) there is
a generator for rotations about each axis. These generators form a linear space
because multiplication of the elements R of the group translates into addition
of generators S; its dimension is defined as the order of G. Therefore, the order
of SO(2) is 1, and it is 3 for SO(3). One can also show that the commutator
of two generators is again a generator,

$$[S_j, S_k] = i \sum_l c_{jk}^{l} S_l,$$

where the c’s are defined as the structure constants of the group. The vector
space of generators can be endowed with a multiplication by defining the
commutator as the product of two generators. This way the vector space of
generators becomes an algebra, the so-called Lie algebra.

Because R does not change the volume (that is, det(R) = 1), we use Eq.
(3.184) to see that

$$\det(R) = \exp(\mathrm{trace}(\ln R)) = \exp(i\varepsilon\, \mathrm{trace}(S)) = 1,$$

which implies that generators are traceless:

$$\mathrm{tr}(S) = 0. \tag{4.11}$$

This is the case for the rotation groups SO(n) and unitary groups SU(n).

If R of G in Eq. (4.10) is unitary, then $S^\dagger = S$ is Hermitian, which is also the
case for SO(n) and SU(n). Hence the i in Eq. (4.10).

Returning to Eq. (4.5), we now emphasize the most important result from
group theory. The inverse of R is just $R^{-1} = \exp(-i\varepsilon S)$. We expand $H_R$ according
to the Baker-Hausdorff formula [Eq. (3.185)]; taking the Hamiltonian H
and S to be operators or matrices we see that

$$H = H_R = \exp(i\varepsilon S) H \exp(-i\varepsilon S) = H + i\varepsilon [S, H] - \tfrac{1}{2} \varepsilon^2 [S, [S, H]] + \cdots. \tag{4.12}$$

We subtract H from Eq. (4.12), divide by $\varepsilon$, and let $\varepsilon \to 0$. Then Eq. (4.12)
implies that for any rotation close to 1 in G the commutator

$$[S, H] = 0. \tag{4.13}$$


We see that S is a constant of the motion: A symmetry of the system
has led to a conservation law. If S and H are Hermitian matrices, Eq. (4.13)
states that S and H can be simultaneously diagonalized; that is, the eigenvalues
of S are constants of the motion. If S and H are differential operators like the
Hamiltonian and orbital angular momentum $L^2$, $L_z$ in quantum mechanics,
then Eq. (4.13) states that S and H have common eigenfunctions, and that the
degenerate eigenvalues of H can be distinguished by the eigenvalues of the
generators S. These eigenfunctions and eigenvalues, s, are solutions of separate
differential equations, $S\psi_s = s\psi_s$, so that group theory (i.e., symmetries) leads
to a separation of variables for a partial differential equation that is invariant
under the transformations of the group. For examples, see the separation of
variables method for partial differential equations in Section 8.9 and special
functions in Chapter 11. This is by far the most important application of group
theory in quantum mechanics.

In the following sections, we study orthogonal and unitary groups as examples
to understand better the general concepts of this section.


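Rotation Groups SO(2) and SO(3)


For SO(2) as defined by Eq. (4.1) there is only one linearly independent generator,
$\sigma_2$, and the order of SO(2) is 1. We get $\sigma_2$ from Eq. (4.9) by differentiation
at the unity of SO(2) (i.e., $\varphi = 0$),

$$-i \frac{dR(\varphi)}{d\varphi} \bigg|_{\varphi = 0} = -i \begin{pmatrix} -\sin\varphi & \cos\varphi \\ -\cos\varphi & -\sin\varphi \end{pmatrix} \bigg|_{\varphi = 0} = -i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \sigma_2. \tag{4.14}$$

For the rotations $R_z(\varphi)$ about the z-axis described by 3 x 3 matrices in
Eq. (4.3), the generator is given by

$$-i \frac{dR_z(\varphi)}{d\varphi} \bigg|_{\varphi = 0} \equiv S_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{4.15}$$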

Here, $\sigma_2$ is recognized in the upper left-hand corner of $S_z$. The rotation $R_z(\delta\varphi)$
through an infinitesimal angle $\delta\varphi$ may then be expanded near the unity ($\varphi = 0$)
as

$$R_z(\delta\varphi) = 1_3 + i\, \delta\varphi\, S_z, \tag{4.16}$$

with terms of order $(\delta\varphi)^2$ and higher omitted. A finite rotation $R(\varphi)$ may be
compounded of successive infinitesimal rotations

$$R_z(\delta\varphi_1 + \delta\varphi_2) = (1_3 + i\, \delta\varphi_1 S_z)(1_3 + i\, \delta\varphi_2 S_z). \tag{4.17}$$

Let $\delta\varphi = \varphi/N$ for N rotations, with $N \to \infty$. Then,

$$R_z(\varphi) = \lim_{N \to \infty} [R_z(\varphi/N)]^N = \lim_{N \to \infty} [1_3 + (i\varphi/N) S_z]^N = \exp(i\varphi S_z), \tag{4.18}$$

which is another way of getting Eq. (4.10). This form identifies $S_z$ as the generator
of the group $R_z$, an Abelian subgroup of SO(3), the group of rotations in
three dimensions with determinant +1. Each 3 x 3 matrix $R_z(\varphi)$ is orthogonal
(hence unitary), and trace($S_z$) = 0 in accordance with Eq. (4.11).
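
The limit in Eq. (4.18) can be watched converging numerically. The following is a sketch under the conventions of Eqs. (4.3) and (4.15); the helper names, the sample angle, and the chosen values of N are assumptions made here for illustration.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """M**n for a non-negative integer n, by repeated squaring."""
    size = len(m)
    result = [[complex(i == j) for j in range(size)] for i in range(size)]
    base = m
    while n:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result


S_z = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]   # generator of Eq. (4.15)


def compounded_rotation(phi, n):
    """[1_3 + (i*phi/n) S_z]^n, the finite-N expression in Eq. (4.18)."""
    step = [[(1 if i == j else 0) + 1j * phi / n * S_z[i][j] for j in range(3)]
            for i in range(3)]
    return mat_pow(step, n)


def rot_z(phi):
    """Exact rotation matrix of Eq. (4.3)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def max_error(phi, n):
    approx, exact = compounded_rotation(phi, n), rot_z(phi)
    return max(abs(approx[i][j] - exact[i][j])
               for i in range(3) for j in range(3))


for n in (10, 100, 1000, 10000):
    print(n, max_error(1.0, n))
```

The deviation from the exact rotation matrix should shrink roughly in proportion to 1/N, consistent with the neglected terms of order $(\delta\varphi)^2$ in each of the N factors.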

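By differentiation of the coordinate rotations

$$R_x(\psi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\psi & \sin\psi \\ 0 & -\sin\psi & \cos\psi \end{pmatrix}, \quad R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix}, \tag{4.19}$$

we get the generators

$$S_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad S_y = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} \tag{4.20}$$

of $R_x$ ($R_y$), the subgroup of rotations about the x- (y-)axis.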
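
With the three generator matrices written out, their commutators can be evaluated directly. The sketch below assumes the matrices of Eqs. (4.15) and (4.20) and checks that $[S_x, S_y] = iS_z$ together with its cyclic permutations, and that each generator is traceless as required by Eq. (4.11); the helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def times_i(m):
    return [[1j * x for x in row] for row in m]


def trace(m):
    return sum(m[i][i] for i in range(3))


S_x = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
S_y = [[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]]
S_z = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]

print(commutator(S_x, S_y) == times_i(S_z))
print(commutator(S_y, S_z) == times_i(S_x))
print(commutator(S_z, S_x) == times_i(S_y))
print(all(trace(s) == 0 for s in (S_x, S_y, S_z)))
```

Because every matrix element is 0 or ±i, the arithmetic is exact and the comparisons can use plain equality.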


Rotation of Functions and Orbital Angular Momentum 


In the foregoing discussion the group elements are matrices that rotate the
coordinates. Any physical system being described is held fixed. Now let us
hold the coordinates fixed and rotate a function $\psi(x, y, z)$ relative to our fixed
coordinates. With R to rotate the coordinates,

$$\mathbf{x}' = R\mathbf{x}, \tag{4.21}$$

we define R on $\psi$ by

$$R\psi(x, y, z) = \psi'(x, y, z) = \psi(\mathbf{x}'). \tag{4.22}$$

In words, R operates on the function $\psi$, creating a new function $\psi'$ that
is numerically equal to $\psi(\mathbf{x}')$, where $\mathbf{x}'$ are the coordinates rotated by R. If
R rotates the coordinates counterclockwise, the effect of R is to rotate the
pattern of the function $\psi$ counterclockwise, as shown in Fig. 2.20.
Returning to Eqs. (4.3), (4.15), and (4.20), consider an infinitesimal rotation
again, $\varphi \to \delta\varphi$. Then, using $R_z$ [Eq. (4.3)], we obtain

$$R_z(\delta\varphi)\psi(x, y, z) = \psi(x + y\,\delta\varphi,\; y - x\,\delta\varphi,\; z). \tag{4.23}$$

The right side may be expanded to first order in $\delta\varphi$ to give

$$R_z(\delta\varphi)\psi(x, y, z) = \psi(x, y, z) - \delta\varphi\left\{x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right\} + O(\delta\varphi)^2 = (1 - i\,\delta\varphi\,L_z)\psi(x, y, z), \tag{4.24}$$

where the differential expression in curly brackets is the orbital angular
momentum $iL_z$ (Exercise 1.7.12). This shows how the orbital angular momentum
operator arises as a generator. Since a rotation of first $\varphi$ and then $\delta\varphi$ about the
$z$-axis is given by

$$R_z(\varphi + \delta\varphi)\psi = R_z(\delta\varphi)R_z(\varphi)\psi = (1 - i\,\delta\varphi\,L_z)R_z(\varphi)\psi, \tag{4.25}$$

we have (as an operator equation)

$$\frac{dR_z}{d\varphi} = \lim_{\delta\varphi \to 0}\frac{R_z(\varphi + \delta\varphi) - R_z(\varphi)}{\delta\varphi} = -iL_zR_z(\varphi). \tag{4.26}$$

In this form, Eq. (4.26) integrates immediately to

$$R_z(\varphi) = \exp(-i\varphi L_z). \tag{4.27}$$

Note that $R_z(\varphi)$ rotates functions (counterclockwise) relative to fixed coordinates
[so Eqs. (4.27) and (4.10) are similar but not the same] and that $L_z$ is the
$z$-component of the orbital angular momentum $\mathbf{L}$. The constant of integration
is fixed by the boundary condition $R_z(0) = 1$.
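
As a numerical illustration of Eq. (4.27), one can pick a function on which $L_z$ acts simply. The sketch below assumes the test function $\psi = x + iy$ (chosen here for illustration, not taken from the text), for which $L_z\psi = -i(x\,\partial/\partial y - y\,\partial/\partial x)\psi = \psi$, so that $\exp(-i\varphi L_z)\psi = e^{-i\varphi}\psi$. It compares this with $\psi(\mathbf{x}')$ evaluated at the rotated coordinates, using the finite version of the coordinate rotation in Eq. (4.23).

```python
import cmath
import math


def rotate_coords(x, y, phi):
    """Finite coordinate rotation about z: x' = x cos(phi) + y sin(phi), etc."""
    return (x * math.cos(phi) + y * math.sin(phi),
            -x * math.sin(phi) + y * math.cos(phi))


def psi(x, y):
    """Illustrative test function psi = x + i y (eigenfunction of L_z, eigenvalue 1)."""
    return complex(x, y)


def rotated_psi(x, y, phi):
    """[R_z(phi) psi](x, y) = psi(x'), following Eq. (4.22)."""
    return psi(*rotate_coords(x, y, phi))


phi, x, y = 0.7, 1.3, -0.4
lhs = rotated_psi(x, y, phi)                 # psi evaluated at rotated coordinates
rhs = cmath.exp(-1j * phi) * psi(x, y)       # exp(-i phi L_z) psi with L_z psi = psi
print(abs(lhs - rhs))
```

The two sides agree to rounding error for any angle and any point, as expected when $L_z$ generates the rotation.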

If we recognize that the operator

$$L_z = (x, y, z)\,S_z\begin{pmatrix} \partial/\partial x \\ \partial/\partial y \\ \partial/\partial z \end{pmatrix}, \tag{4.28}$$

it becomes clear why $L_x$, $L_y$, and $L_z$ satisfy the same commutation relation

$$[L_i, L_j] = i\varepsilon_{ijk}L_k \tag{4.29}$$

as $S_x$, $S_y$, and $S_z$ and yield the structure constants of SO(3).


Special Unitary Group SU(2)

Since unitary 2 × 2 matrices transform complex two-dimensional vectors preserving
their norm, they represent the most general transformations of (a basis
in the Hilbert space of) spin 1/2 wave functions in nonrelativistic quantum
mechanics. The basis states of this system are conventionally chosen to be

$$|\uparrow\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |\downarrow\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

corresponding to spin 1/2 up and down states, respectively. We can show that the
special unitary group SU(2) of such unitary 2 × 2 matrices with determinant
+1 has the three Pauli matrices $\sigma_i$ as generators. Therefore, we expect SU(2)
to be of order 3 and to depend on three real continuous parameters $\xi, \eta, \zeta$,
which are often called the Cayley-Klein parameters and are essentially the
SU(2) analog of Euler angles. We start with the observation that orthogonal
2 × 2 matrices [Eq. (4.1)] are real unitary matrices, so they form a subgroup of
SU(2). We also see that

$$\begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{pmatrix}$$

is unitary for real angle $\alpha$ with determinant +1. Therefore, these simple and
manifestly unitary matrices form another subgroup of SU(2) from which we
can obtain all elements of SU(2); that is, the general 2 × 2 unitary matrix
of determinant +1. For a two-component spin 1/2 wave function of quantum
mechanics, this diagonal unitary matrix corresponds to multiplication of the
spin-up wave function with a phase factor $e^{i\alpha}$ and the spin-down component
with the inverse phase factor. Using the real angle $\eta$ instead of $\varphi$ for the rotation
matrix and then multiplying by the diagonal unitary matrices, we construct a
2 × 2 unitary matrix that depends on three parameters and is clearly a more
general element of SU(2):

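$$\begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{-i\alpha} \end{pmatrix}\begin{pmatrix} \cos\eta & \sin\eta \\ -\sin\eta & \cos\eta \end{pmatrix}\begin{pmatrix} e^{i\beta} & 0 \\ 0 & e^{-i\beta} \end{pmatrix}$$

$$= \begin{pmatrix} e^{i\alpha}\cos\eta & e^{i\alpha}\sin\eta \\ -e^{-i\alpha}\sin\eta & e^{-i\alpha}\cos\eta \end{pmatrix}\begin{pmatrix} e^{i\beta} & 0 \\ 0 & e^{-i\beta} \end{pmatrix}$$

$$= \begin{pmatrix} e^{i(\alpha+\beta)}\cos\eta & e^{i(\alpha-\beta)}\sin\eta \\ -e^{-i(\alpha-\beta)}\sin\eta & e^{-i(\alpha+\beta)}\cos\eta \end{pmatrix}.$$

Defining $\alpha + \beta \equiv \xi$, $\alpha - \beta \equiv \zeta$, we have in fact constructed the general element
of SU(2):

$$U_2(\xi, \eta, \zeta) = \begin{pmatrix} e^{i\xi}\cos\eta & e^{i\zeta}\sin\eta \\ -e^{-i\zeta}\sin\eta & e^{-i\xi}\cos\eta \end{pmatrix} = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}, \tag{4.30}$$

where $|a|^2 + |b|^2 = 1$. It is easy to check that the determinant $\det(U_2) = 1$ by
the product theorem of Section 3.2 and that $U_2^\dagger U_2 = 1 = U_2U_2^\dagger$ holds provided
$\xi$, $\eta$, $\zeta$ are real numbers.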
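
The two properties just stated can be confirmed numerically for arbitrary real parameters. The following is a small sketch in plain Python; the function names and the sample parameter values are illustrative assumptions.

```python
import cmath
import math


def u2(xi, eta, zeta):
    """General SU(2) element of Eq. (4.30) as a 2x2 nested list [[a, b], [-b*, a*]]."""
    a = cmath.exp(1j * xi) * math.cos(eta)
    b = cmath.exp(1j * zeta) * math.sin(eta)
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def dagger_times(m):
    """Return the product (m^dagger) m for a 2x2 matrix."""
    return [[sum(m[k][i].conjugate() * m[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]


U = u2(0.3, 1.1, -0.8)        # arbitrary real parameters
determinant = det2(U)         # should equal 1
gram = dagger_times(U)        # should equal the 2x2 unit matrix
print(determinant, gram)
```

Both checks reduce to $|a|^2 + |b|^2 = \cos^2\eta + \sin^2\eta = 1$, which is why they hold only for real $\xi$, $\eta$, $\zeta$.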

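To get the generators, we differentiate

$$-i\left.\frac{\partial U_2}{\partial\xi}\right|_{\xi=0,\,\eta=0} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma_3, \tag{4.31a}$$

$$-i\left.\frac{\partial U_2}{\partial\eta}\right|_{\eta=0,\,\zeta=0} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \sigma_2. \tag{4.31b}$$

To avoid a factor $1/\sin\eta$ for $\eta \to 0$ upon differentiating with respect to $\zeta$, we
use instead the right-hand side of Eq. (4.30) for $U_2$ for pure imaginary $b = i\beta$
with $\beta \to 0$, so that $a = \sqrt{1 - \beta^2}$ from $|a|^2 + |b|^2 = 1$. Differentiating such a $U_2$, we get the third generator

$$-i\left.\frac{\partial}{\partial\beta}\begin{pmatrix} \sqrt{1-\beta^2} & i\beta \\ i\beta & \sqrt{1-\beta^2} \end{pmatrix}\right|_{\beta=0} = -i\left.\begin{pmatrix} \frac{-\beta}{\sqrt{1-\beta^2}} & i \\ i & \frac{-\beta}{\sqrt{1-\beta^2}} \end{pmatrix}\right|_{\beta=0} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sigma_1. \tag{4.31c}$$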


Being generators, these Pauli matrices are all traceless and Hermitian. A
correspondence with the physical world is obtained if we scale the SU(2) generators
so that they yield the angular momentum commutators. With the Pauli matrices
as generators, the elements $U_1$, $U_2$, $U_3$ of SU(2) may be generated by

$$U_1 = \exp(-i\alpha\sigma_1/2), \quad U_2 = \exp(-i\beta\sigma_2/2), \quad U_3 = \exp(-i\gamma\sigma_3/2). \tag{4.32}$$

The three parameters are real, and we interpret them as angles. The extra
scale factor 1/2 is present in the exponents because $S_i = \sigma_i/2$ satisfy the same
commutation relations,$^{4}$

$$[S_i, S_j] = i\varepsilon_{ijk}S_k, \tag{4.33}$$

as the orbital angular momentum in Eq. (4.29).

Using the angular momentum matrix $S_3$, we have as the corresponding
rotation operator $R_z(\varphi) = \exp(i\varphi\sigma_3/2)$ in two-dimensional (complex wave
function) space, analogous to Eq. (4.3) that gives the operator for rotating the
Cartesian coordinates in the three-space.

For rotating the two-component vector wave function (spinor) of a spin
1/2 particle relative to fixed coordinates, the rotation operator is $R_z(\varphi) =
\exp(-i\varphi\sigma_3/2)$ according to Eq. (4.27).

Using in Eq. (4.32) the Euler identity [Eq. (3.183)] we obtain

$$U_j = \cos(\alpha/2) - i\sigma_j\sin(\alpha/2), \tag{4.34}$$

etc. Here, the parameter $\alpha$ appears as an angle, the coefficient of an angular
momentum matrix, like $\varphi$ in Eq. (4.27). With this identification of the exponentials,
the general form of the SU(2) matrix (for rotating functions rather than
coordinates) may be written as

$$U(\alpha, \beta, \gamma) = \exp(-i\gamma\sigma_3/2)\exp(-i\beta\sigma_2/2)\exp(-i\alpha\sigma_3/2), \tag{4.35}$$

where the SU(2) Euler angles $\alpha, \beta, \gamma$ differ from the $\alpha, \beta, \gamma$ used in the definition
of the Cayley-Klein parameters $\xi, \eta, \zeta$ by a factor of 1/2. Further discussion
of the relation between SO(3) and orbital angular momentum appears in
Sections 4.3 and 11.7.
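
The step from the exponential in Eq. (4.32) to the closed form in Eq. (4.34) can be checked by summing the Maclaurin series term by term. The sketch below does this in plain Python; the truncation at 30 terms and the sample angle are assumptions made for the illustration.

```python
import math

IDENTITY = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(m, terms=30):
    """Truncated Maclaurin series sum_n m^n / n! for a 2x2 matrix."""
    total = [[complex(x) for x in row] for row in IDENTITY]
    power = [[complex(x) for x in row] for row in IDENTITY]
    for n in range(1, terms):
        power = matmul(power, m)
        for i in range(2):
            for j in range(2):
                total[i][j] += power[i][j] / math.factorial(n)
    return total


def u_series(alpha, j):
    """exp(-i alpha sigma_j / 2) evaluated from the series."""
    return exp_series([[-0.5j * alpha * x for x in row] for row in PAULI[j]])


def u_closed(alpha, j):
    """cos(alpha/2) - i sigma_j sin(alpha/2), the closed form of Eq. (4.34)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c * IDENTITY[r][k] - 1j * s * PAULI[j][r][k] for k in range(2)]
            for r in range(2)]


def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


euler_errors = [max_abs_diff(u_series(1.7, j), u_closed(1.7, j)) for j in (1, 2, 3)]
print(euler_errors)
```

The closed form works because $\sigma_j^2 = 1$, so even powers collect into the cosine and odd powers into the sine. Evaluating `u_closed(2 * math.pi, 3)` gives minus the unit matrix, which reflects the half-angle in the exponent.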


The orbital angular momentum operators are the generators of the rotation
group SO(3) and (1/2) the Pauli spin matrices are those for SU(2), the symmetry
group of the Schrödinger equation for a spin 1/2 particle such as the electron.
Generators obey commutation relations characteristic of the group.


4 The structure constants ($\varepsilon_{ijk}$) lead to the SU(2) representations of dimension 2J + 1 for generators
of dimension 2J + 1, J = 0, 1/2, 1, .... The integral J cases also lead to the representations of
SO(3).
EXERCISES

4.2.1 (i) Show that the Pauli matrices are the generators of SU(2) without using
the parameterization of the general unitary 2 × 2 matrix in Eq. (4.30).
Hint. Exploit the general properties of generators.

4.2.2 Prove that the general form of a 2 × 2 unitary, unimodular matrix is

$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix},$$

with $a^*a + b^*b = 1$. Based on this result, derive the parameterization of
Eq. (4.30).

4.2.3 A translation operator $T(a)$ converts $\psi(x)$ to $\psi(x + a)$,

$$T(a)\psi(x) = \psi(x + a).$$

In terms of the (quantum mechanical) linear momentum operator $p_x =
-i\,d/dx$, show that $T(a) = \exp(iap_x)$ (i.e., $p_x$ is the generator of translations).

Hint. Expand $\psi(x + a)$ as a Taylor series.

4.2.4 Consider the general SU(2) element Eq. (4.30) to be built up of three
Euler rotations: (i) a rotation of $a/2$ about the $z$-axis, (ii) a rotation of
$b/2$ about the new $x$-axis, and (iii) a rotation of $c/2$ about the new $z$-axis.
(All rotations are counterclockwise.) Using the Pauli $\sigma$ generators, show
that these rotation angles are determined by

$$a = \xi - \zeta + \pi/2 = \alpha + \pi/2,$$
$$b = 2\eta = \beta,$$
$$c = \xi + \zeta - \pi/2 = \gamma - \pi/2.$$

Note. The angles $a$ and $b$ here are not the $a$ and $b$ of Eq. (4.30).

4.2.5 We know that any 2 × 2 matrix A can be expanded as $A = a_0 \cdot 1 + \mathbf{a} \cdot \boldsymbol{\sigma}$,
where 1 is the two-dimensional unit matrix. Determine $a_0$ and $\mathbf{a}$ for the
general SU(2) matrix in Eq. (4.30).

4.2.6 Rotate a nonrelativistic wave function $\psi = (\psi_\uparrow, \psi_\downarrow)$ of spin 1/2 about the
$z$-axis by a small angle $d\theta$. Find the corresponding generator.


4.3 Orbital Angular Momentum 


The classical concept of angular momentum $\mathbf{L}_{\rm class} = \mathbf{r} \times \mathbf{p}$ is presented in
Section 1.3 to introduce the cross product. Following the usual Schrödinger
representation of quantum mechanics, the classical linear momentum $\mathbf{p}$ is
replaced by the operator $-i\nabla$. The quantum mechanical orbital angular
momentum operator becomes$^{5}$

$$\mathbf{L}_{QM} = -i\,\mathbf{r} \times \nabla. \tag{4.36}$$

This is used repeatedly in Sections 1.7, 1.8, and 2.4 to illustrate vector
differential operators. From Exercise 1.7.13 the angular momentum components
satisfy the commutation relations

$$[L_i, L_j] = i\varepsilon_{ijk}L_k. \tag{4.37}$$

The $\varepsilon_{ijk}$ is the Levi-Civita symbol of Section 2.9. A summation over the index
$k$ is understood.

The differential operator corresponding to the square of the angular
momentum

$$\mathbf{L}^2 = \mathbf{L} \cdot \mathbf{L} = L_x^2 + L_y^2 + L_z^2 \tag{4.38}$$

may be determined from

$$\mathbf{L} \cdot \mathbf{L} = (\mathbf{r} \times \mathbf{p}) \cdot (\mathbf{r} \times \mathbf{p}), \tag{4.39}$$

which is the subject of Exercises 1.8.6 and 2.5.12(c). Since $\mathbf{L}^2$ is invariant under
rotations, $[\mathbf{L}^2, L_i] = 0$, which can also be verified directly.

Equation (4.37) presents the basic commutation relations of the components
of the quantum mechanical angular momentum. Indeed, within the
framework of quantum mechanics and group theory, these commutation relations
define an angular momentum operator. We shall use them now to construct
the angular momentum eigenstates and find the eigenvalues. For the
orbital angular momentum, these are the spherical harmonics of Section 11.5.

Ladder Operator Approach

Let us start with a general approach, where the angular momentum $\mathbf{J}$ we
consider may represent an orbital angular momentum $\mathbf{L}$, a spin $\boldsymbol{\sigma}/2$, a total
angular momentum $\mathbf{L} + \boldsymbol{\sigma}/2$, etc. Thus,

1. $\mathbf{J}$ is an Hermitian operator whose components satisfy the commutation
relations

$$[J_i, J_j] = i\varepsilon_{ijk}J_k, \qquad [\mathbf{J}^2, J_i] = 0. \tag{4.40}$$

Otherwise $\mathbf{J}$ is arbitrary. (See Exercise 4.3.1.)

2. $|\lambda m\rangle$ is a normalized eigenfunction (or eigenvector) of $J_z$ with eigenvalue
$m$ and an eigenfunction$^{6}$ of $\mathbf{J}^2$,

$$J_z|\lambda m\rangle = m|\lambda m\rangle, \qquad \mathbf{J}^2|\lambda m\rangle = \lambda|\lambda m\rangle. \tag{4.41}$$


5 For simplicity, $\hbar$ is set equal to 1. This means that the angular momentum is measured in units
of $\hbar$.

6 That $|\lambda m\rangle$ is an eigenfunction of both $J_z$ and $\mathbf{J}^2$ follows from $[J_z, \mathbf{J}^2] = 0$ in Eq. (4.40). Note also
that eigenvalues are in small letters, whereas operators are in capitals.
We shall show that $\lambda = j(j + 1)$, with $j$ either integral or half-integral,
and then find other properties of the $|\lambda m\rangle$. The treatment will illustrate the
generality and power of operator techniques, particularly the use of ladder
operators.$^{7}$

The ladder operators are defined as

$$J_+ = J_x + iJ_y, \qquad J_- = J_x - iJ_y. \tag{4.42}$$

In terms of these operators $\mathbf{J}^2$ may be rewritten as

$$\mathbf{J}^2 = \tfrac{1}{2}(J_+J_- + J_-J_+) + J_z^2. \tag{4.43}$$

From the commutation relations, Eq. (4.40), we find

$$[J_z, J_+] = +J_+, \qquad [J_z, J_-] = -J_-, \qquad [J_+, J_-] = 2J_z. \tag{4.44}$$

Since $J_+$ commutes with $\mathbf{J}^2$ (Exercise 4.3.1),

$$\mathbf{J}^2(J_+|\lambda m\rangle) = J_+(\mathbf{J}^2|\lambda m\rangle) = \lambda(J_+|\lambda m\rangle). \tag{4.45}$$

Therefore, $J_+|\lambda m\rangle$ is still an eigenfunction of $\mathbf{J}^2$ with eigenvalue $\lambda$, and similarly
for $J_-|\lambda m\rangle$. However, from Eq. (4.44)

$$J_zJ_+ = J_+(J_z + 1), \tag{4.46}$$

or

$$J_z(J_+|\lambda m\rangle) = J_+(J_z + 1)|\lambda m\rangle = (m + 1)J_+|\lambda m\rangle. \tag{4.47}$$

Therefore, $J_+|\lambda m\rangle$ is still an eigenfunction of $J_z$ but with the eigenvalue $m + 1$.
$J_+$ has raised the eigenvalue by 1 and so is called a raising operator. Similarly,
$J_-$ lowers the eigenvalue by 1; it is a lowering operator.

Taking the diagonal matrix element (also called expectation value) and
using $J_x^\dagger = J_x$, $J_y^\dagger = J_y$, we get

$$\langle\lambda m|\mathbf{J}^2 - J_z^2|\lambda m\rangle = \langle\lambda m|J_x^2 + J_y^2|\lambda m\rangle = |J_x|\lambda m\rangle|^2 + |J_y|\lambda m\rangle|^2 \ge 0$$

and see that $\lambda - m^2 \ge 0$, so $m$ is bounded, and $\lambda \ge 0$. Let $j$ be the largest
$m$ value. Then $J_+|\lambda j\rangle = 0$, which implies $J_-J_+|\lambda j\rangle = 0$. Hence, combining
Eqs. (4.43) and (4.44) to get

$$\mathbf{J}^2 = J_-J_+ + J_z(J_z + 1), \tag{4.48}$$

we find from Eq. (4.48) that

$$0 = J_-J_+|\lambda j\rangle = (\mathbf{J}^2 - J_z^2 - J_z)|\lambda j\rangle = (\lambda - j^2 - j)|\lambda j\rangle.$$

Therefore,

$$\lambda = j(j + 1) \ge 0. \tag{4.49}$$


7 Ladder operators can be developed for other mathematical functions. Compare Section 13.1 for
Hermite polynomials.
We now relabel the states $|\lambda m\rangle \equiv |jm\rangle$. Similarly, let $j'$ be the smallest $m$
value. Then $J_-|jj'\rangle = 0$. From

$$\mathbf{J}^2 = J_+J_- + J_z(J_z - 1), \tag{4.50}$$

we see that

$$0 = J_+J_-|jj'\rangle = (\mathbf{J}^2 + J_z - J_z^2)|jj'\rangle = (\lambda + j' - j'^2)|jj'\rangle. \tag{4.51}$$

Hence,

$$\lambda = j(j + 1) = j'(j' - 1) = (-j)(-j - 1).$$

Therefore, $j' = -j$, and $j \ge 0$ because $j' \le j$. Moreover, $m$ runs in integer
steps from $-j$ to $j$,

$$-j \le m \le j, \tag{4.52}$$

so that $2j$ must be a positive integer. Thus, $j$ is either an integer or half of an
odd integer.

Starting from $|jj\rangle$ and applying $J_-$ repeatedly, we reach all other states
$|jm\rangle$. Hence, the $|jm\rangle$ form an irreducible representation; $m$ varies and $j$ is
fixed.

Then using Eqs. (4.40), (4.48), and (4.50) we obtain

$$J_-J_+|jm\rangle = [j(j + 1) - m(m + 1)]|jm\rangle = (j - m)(j + m + 1)|jm\rangle,$$
$$J_+J_-|jm\rangle = [j(j + 1) - m(m - 1)]|jm\rangle = (j + m)(j - m + 1)|jm\rangle. \tag{4.53}$$

Because $J_+$ and $J_-$ are Hermitian conjugates,$^{8}$

$$J_+^\dagger = J_-, \qquad J_-^\dagger = J_+, \tag{4.54}$$

the eigenvalues in Eq. (4.53) must be positive or zero.$^{9}$ This follows from

$$\langle jm|J_-(J_+|jm\rangle) = (J_+|jm\rangle)^\dagger J_+|jm\rangle \ge 0. \tag{4.55}$$

Examples of Eq. (4.53) are provided by the matrices of Exercises 3.2.9 (spin 1/2),
3.2.11 (spin 1), and 3.2.13 (spin 3/2). For the orbital angular momentum ladder
operators $L_+$ and $L_-$, explicit forms are given in Exercise 2.5.10.

Since $J_+$ raises the eigenvalue $m$ to $m + 1$, we relabel the resultant eigenfunction
$|j\,m{+}1\rangle$. From Eqs. (4.47) and (4.53) we see that

$$J_+|jm\rangle = \sqrt{(j - m)(j + m + 1)}\;|j\,m{+}1\rangle, \tag{4.56}$$

taking the positive square root and not introducing any phase factor. By the
same arguments

$$J_-|jm\rangle = \sqrt{(j + m)(j - m + 1)}\;|j\,m{-}1\rangle. \tag{4.57}$$
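
Equations (4.56) and (4.57) fix the matrix elements of the ladder operators completely, so they can be turned into explicit matrices for any $j$. The following is a minimal sketch in plain Python; the basis ordering $m = j, j-1, \ldots, -j$ and the function names are choices made here for illustration.

```python
import math
from fractions import Fraction


def ladder_matrices(j):
    """Return (J_z, J_plus, J_minus) in the basis |j m>, ordered m = j, j-1, ..., -j."""
    j = Fraction(j)
    dim = int(2 * j + 1)
    ms = [j - k for k in range(dim)]
    jz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        # Eq. (4.56): J_+ |j m> = sqrt((j - m)(j + m + 1)) |j m+1>
        jp[col - 1][col] = math.sqrt((j - m) * (j + m + 1))
    # J_- is the Hermitian conjugate of J_+ (entries are real, so the transpose)
    jm = [[jp[c][r] for c in range(dim)] for r in range(dim)]
    return jz, jp, jm


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def j_squared(j):
    """J^2 = J_- J_+ + J_z (J_z + 1), Eq. (4.48)."""
    jz, jp, jm = ladder_matrices(j)
    dim = len(jz)
    mp, zz = matmul(jm, jp), matmul(jz, jz)
    return [[mp[r][c] + zz[r][c] + jz[r][c] for c in range(dim)] for r in range(dim)]


j2 = j_squared(Fraction(3, 2))
print([j2[k][k] for k in range(4)])   # each diagonal entry should be j(j+1) = 3.75
```

For $j = 3/2$ the result is $j(j+1) = 15/4$ times the unit matrix, in line with Eq. (4.49); for $j = 1/2$ the raising matrix reduces to the $2 \times 2$ matrix with a single 1 in the upper right corner.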


8 The Hermitian conjugation or adjoint operation is defined for matrices in Section 3.4 and for 
operators in general in Section 9.1. 

9 For an excellent discussion of adjoint operators and Hilbert space, see Messiah, A. (1961). Quan¬ 
tum Mechanics, Chapter 7. Wiley, New York. 
As shown later, orbital angular momentum is described with integral $j$. From
the spins of some of the fundamental particles and of some nuclei, we get
$j = 1/2, 3/2, 5/2, \ldots$. Our angular momentum is quantized, essentially as a
result of the commutation relations.


EXAMPLE 4.3.1

Spin 1/2 States The spin raising operator is given by

$$S_+ = \tfrac{1}{2}\sigma_+ = \tfrac{1}{2}(\sigma_x + i\sigma_y) = \frac{1}{2}\begin{pmatrix} 0 & 1 + i(-i) \\ 1 + i \cdot i & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$$

so that

$$S_+\chi_\downarrow = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \chi_\uparrow,$$

which is consistent with $\sqrt{[\tfrac{1}{2} - (-\tfrac{1}{2})](\tfrac{1}{2} - \tfrac{1}{2} + 1)} = 1$ of Eq. (4.56).

In spherical polar coordinates $\theta, \varphi$ the functions $\langle\theta, \varphi|lm\rangle = Y_l^m(\theta, \varphi)$ are the
spherical harmonics of Section 11.5. Similarly, we can work out Eq. (4.57)
for the orbital angular momentum lowering operator $L_- = L_x - iL_y = -e^{-i\varphi}\left(\frac{\partial}{\partial\theta} - i\cot\theta\frac{\partial}{\partial\varphi}\right)$
from Exercise 2.5.10b acting on the spherical harmonic
$\langle\theta, \varphi|11\rangle = -\sqrt{\frac{3}{8\pi}}\sin\theta\,e^{i\varphi} = Y_{11}(\theta, \varphi)$. We find

$$L_-Y_{11} = -e^{-i\varphi}\left(\frac{\partial}{\partial\theta} - i\cot\theta\frac{\partial}{\partial\varphi}\right)(-1)\sqrt{\frac{3}{8\pi}}\sin\theta\,e^{i\varphi}$$

$$= e^{-i\varphi}\sqrt{\frac{3}{8\pi}}(\cos\theta - i\cos\theta \cdot i)e^{i\varphi} = \sqrt{\frac{6}{4\pi}}\cos\theta = \sqrt{2}\,\langle\theta, \varphi|1, 0\rangle = \sqrt{2}\,Y_{10},$$

where $\sqrt{(1 + 1)(1 - 1 + 1)} = \sqrt{2}$. That is, the ladder formulas [Eqs. (4.56) and
(4.57)] apply to the spherical harmonics and are equivalent to using the
differential operators for $\mathbf{L}$.


SUMMARY

Generators for the classical Lie groups can be organized into those defining
additive eigenvalues and ladder operators that raise or lower these eigenvalues.
For the rotation group SO(3) these are $L_z$ and $L_\pm$. Altogether, they define the
selection rules of a symmetry group.

EXERCISES 

4.3.1 Show that (a) $[J_+, \mathbf{J}^2] = 0$, (b) $[J_-, \mathbf{J}^2] = 0$.

4.3.2 Write down all matrix elements $\langle j'm'|O|jm\rangle$ of the angular momentum
operator $O = \mathbf{J}^2$, $O = J_z$, and $O = J_\pm$.

4.3.3 Construct matrix representations for $J_\pm$ and $J_z$ for angular momentum
$J = 1, 3/2, 2, 5/2$.

4.3.4 Let $|a, b\rangle$ be a complete set of common eigenfunctions of the Hermitian
operators A and B; that is,

$$A|a, b\rangle = a|a, b\rangle, \qquad B|a, b\rangle = b|a, b\rangle.$$

Show that $[A, B] = 0$. Is the inverse conclusion valid?

4.3.5 The three 2 × 2 matrices $\sigma_j'$ for $j = 1, 2, 3$ satisfy the same commutation
relations as the Pauli matrices. Show that there is one matrix $s$ so that
$\sigma_j' = s\sigma_js^{-1}$ for $j = 1, 2, 3$. Interpret this result in your own words.

4.3.6 Determine the eigenvalues of the orbital angular momentum operators
$\mathbf{L}^2$ and $L_z$ for the functions $e^{-r/a}\cos\theta$ and $e^{-r/a}\sin\theta\,e^{\pm i\varphi}$, where $a$ is
some constant length.

4.3.7 Derive the generators of SU(3). Determine the order of SU(3). Write
down various raising and lowering operators.

4.3.8 Explain why the theorem of Exercise 4.3.5 does not hold for three
corresponding 3 × 3 generator matrices of SU(3).


4.4 Homogeneous Lorentz Group 


Generalizing the approach to vectors of Section 2.6, in special relativity we 
demand that our physical laws be covariant 10 under 

• space and time translations, 

• rotations in real, three-dimensional space, and 

• Lorentz transformations. 

The demand for covariance under translations is based on the homogeneity of 
space and time. Covariance under rotations is an assertion of the isotropy of 
space. The requirement of Lorentz covariance follows from special relativity. 

Space rotations and pure Lorentz transformations together form the homo¬ 
geneous Lorentz group, and they form the Poincare group when translations 
are included as well. 

We first generate a subgroup: the Lorentz transformations in which the
relative velocity $\mathbf{v}$ is along the $x = x_1$ axis. The generator may be determined
by considering space-time reference frames moving with a relative velocity $\delta v$,
an infinitesimal.$^{11}$ The relations are similar to those for rotations in real space
(Sections 2.6 and 3.3), except that here the angle of rotation is pure imaginary.

Lorentz transformations are linear not only in the space coordinates $x_i$ but
also in time $t$. They originate from Maxwell’s equations of electrodynamics,


10 To be covariant means to have the same form in different coordinate systems, often called 
inertial frames, so that there is no preferred reference system (compare Section 2.6). 

11 This derivation, with a slightly different metric, appears in an article by J. L. Strecker, Am. J. 
Phys. 35, 12 (1967). 










which are invariant under Lorentz transformations. Lorentz transformations
leave the quadratic form

$$c^2t^2 - x_1^2 - x_2^2 - x_3^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2$$

invariant, where $x_0 = ct$ and $c$ is the velocity of light that is the same in all
inertial frames. We see this invariance if we switch on a light source at the origin
of the coordinate system. At time $t$ light has traveled the distance $ct = \sqrt{\sum_i x_i^2}$
so that $c^2t^2 - x_1^2 - x_2^2 - x_3^2 = 0$. Special relativity requires that in any inertial
frame whose coordinates are $x_i'$ that moves with velocity $v \le c$ in any direction
relative to the $x_i$ system and has the same origin at time $t = 0$,

$$c^2t'^2 - x_1'^2 - x_2'^2 - x_3'^2 = 0$$

holds also.

Four-dimensional space-time with the metric

$$x \cdot x = x^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2$$

is called Minkowski space, with the scalar product of two four-vectors defined
as

$$a \cdot b = a_0b_0 - \mathbf{a} \cdot \mathbf{b}.$$


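Using the metric tensor

$$(g_{\mu\nu}) = (g^{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{4.58}$$

we can raise and lower the indices of a four-vector, such as the coordinates
$x^\mu = (x_0, \mathbf{x})$, so that $x_\mu = g_{\mu\nu}x^\nu = (x_0, -\mathbf{x})$, $x^\mu g_{\mu\nu}x^\nu = x_0^2 - \mathbf{x}^2$, Einstein’s
summation convention being understood. It is a common convention to use
Greek letters for four-vector indices and, because $x_0 = x^0$, we usually write $x_0$
for the time variable. For the gradient that is a covariant four-vector we have

$$\partial_\mu = \left(\frac{\partial}{\partial x^0}, \nabla\right) = \frac{\partial}{\partial x^\mu}, \qquad \partial^\mu = \left(\frac{\partial}{\partial x^0}, -\nabla\right),$$

so that

$$\partial^2 = \partial^\mu\partial_\mu = \left(\frac{\partial}{\partial x^0}\right)^2 - \nabla^2$$

is a Lorentz scalar, just like the metric $x^2 = x_0^2 - \mathbf{x}^2$.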

For $v \ll c$, in the nonrelativistic limit, a Lorentz transformation must be
Galilean. Hence, to derive the form of a Lorentz transformation along the
$x_1$-axis, we start with a Galilean transformation for infinitesimal relative
velocity $\delta v$:

$$x'^1 = x^1 - t\,\delta v = x^1 - x^0\,\delta\beta. \tag{4.59}$$

Here, as usual, $\beta = v/c$. By symmetry, we also write

$$x'^0 = x^0 + a\,\delta\beta\,x^1, \tag{4.60}$$

with $a$ a parameter that is fixed by the requirement that $x_0^2 - x_1^2$ be invariant,

$$x_0'^2 - x_1'^2 = x_0^2 - x_1^2. \tag{4.61}$$

Remember that $x^\mu = (x^0, \mathbf{x})$ is the prototype four-dimensional vector in
Minkowski space. Thus, Eq. (4.61) is simply a statement of the invariance of the
square of the magnitude of the “distance” vector under Lorentz transformation
in Minkowski space. Here is where the special relativity is brought into our
transformation. Squaring and subtracting Eqs. (4.59) and (4.60) and discarding
terms of order $(\delta\beta)^2$, we find $a = -1$. Equations (4.59) and (4.60) may be
combined as a matrix equation

$$\begin{pmatrix} x'^0 \\ x'^1 \end{pmatrix} = (1_2 - \delta\beta\,\sigma_1)\begin{pmatrix} x^0 \\ x^1 \end{pmatrix}, \tag{4.62}$$

where $\sigma_1$ happens to be the Pauli matrix $\sigma_x$ and the parameter $\delta\beta$ represents
an infinitesimal change. Using the same techniques as in Section 4.2, we repeat
the transformation $N$ times to develop a finite transformation with the velocity
parameter $\rho = N\,\delta\beta$. Then

$$\begin{pmatrix} x'^0 \\ x'^1 \end{pmatrix} = \left(1_2 - \frac{\rho\sigma_1}{N}\right)^N\begin{pmatrix} x^0 \\ x^1 \end{pmatrix}. \tag{4.63}$$

In the limit as $N \to \infty$,

$$\lim_{N\to\infty}\left(1_2 - \frac{\rho\sigma_1}{N}\right)^N = \exp(-\rho\sigma_1). \tag{4.64}$$

As in Section 4.2, the exponential is expanded as a Maclaurin expansion

$$\exp(-\rho\sigma_1) = 1_2 - \rho\sigma_1 + \frac{1}{2!}(\rho\sigma_1)^2 - \frac{1}{3!}(\rho\sigma_1)^3 + \cdots. \tag{4.65}$$

Noting that $\sigma_1^2 = 1_2$,

$$\exp(-\rho\sigma_1) = 1_2\cosh\rho - \sigma_1\sinh\rho. \tag{4.66}$$

Hence, our finite Lorentz transformation is

$$\begin{pmatrix} x'^0 \\ x'^1 \end{pmatrix} = \begin{pmatrix} \cosh\rho & -\sinh\rho \\ -\sinh\rho & \cosh\rho \end{pmatrix}\begin{pmatrix} x^0 \\ x^1 \end{pmatrix}. \tag{4.67}$$
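
The limit in Eq. (4.64) can be illustrated numerically by composing many small boosts and comparing with the closed form of Eq. (4.67). The sketch below is plain Python; taking $N$ to be a power of 2, so that the $N$-fold product can be built by repeated squaring, is a convenience assumed here and not part of the derivation.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def boost_from_small_steps(rho, doublings=20):
    """(1 - rho*sigma_1/N)^N with N = 2**doublings, built by repeated squaring."""
    n = 2 ** doublings
    m = [[1.0, -rho / n], [-rho / n, 1.0]]
    for _ in range(doublings):
        m = matmul2(m, m)
    return m


def boost_closed_form(rho):
    """Eq. (4.67): [[cosh rho, -sinh rho], [-sinh rho, cosh rho]]."""
    c, s = math.cosh(rho), math.sinh(rho)
    return [[c, -s], [-s, c]]


rho = 0.8
approx = boost_from_small_steps(rho)
exact = boost_closed_form(rho)
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_error)
```

With about a million steps the product already agrees with the hyperbolic functions to roughly six decimal places, and the residual shrinks like $1/N$.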
$\sigma_1$ has generated the representations of this pure Lorentz transformation. $\cosh\rho$
and $\sinh\rho$ may be identified by considering the origin of the primed coordinate
system, $x'^1 = 0$, or $x^1 = vt$. Substituting into Eq. (4.67), we have

$$0 = x^1\cosh\rho - x^0\sinh\rho. \tag{4.68}$$

With $x^1 = vt$ and $x^0 = ct$,

$$\tanh\rho = \beta = v/c. \tag{4.69}$$

Note that the rapidity $\rho \ne v/c$ except in the limit as $v \to 0$. The rapidity is the
additive parameter of a pure Lorentz transformation (“boost”) along some
axis that plays the same role as the angle of a rotation about some axis.

Using $1 - \tanh^2\rho = (\cosh^2\rho)^{-1}$,

$$\cosh\rho = (1 - \beta^2)^{-1/2} \equiv \gamma, \qquad \sinh\rho = \beta\gamma. \tag{4.70}$$
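
Because rapidities add for boosts along a common axis while velocities do not, Eqs. (4.69) and (4.70) are easy to exercise numerically. The following sketch (plain Python, with illustrative function names and sample values of $\beta$) checks that adding rapidities reproduces the Einstein velocity addition law and that $\cosh\rho$ and $\sinh\rho$ agree with $\gamma$ and $\beta\gamma$.

```python
import math


def rapidity(beta):
    """rho from tanh(rho) = beta, Eq. (4.69)."""
    return math.atanh(beta)


def gamma_factor(beta):
    """gamma = (1 - beta^2)^(-1/2), Eq. (4.70)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def combine_parallel(beta1, beta2):
    """Add the rapidities of two boosts along the same axis; return the resulting beta."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))


beta1, beta2 = 0.6, 0.7
beta_total = combine_parallel(beta1, beta2)
einstein = (beta1 + beta2) / (1.0 + beta1 * beta2)   # velocity addition law in units of c
print(beta_total, einstein)
```

The combined $\beta$ stays below 1 for any pair of subluminal inputs, since $\tanh$ never reaches 1 for finite rapidity.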


The preceding special case of the velocity parallel to one space axis is easy,
but it illustrates the infinitesimal velocity-exponentiation-generator technique.
Now this exact technique may be applied to derive the Lorentz transformation
for the relative velocity $\mathbf{v}$ not parallel to any space axis. The matrices
given by Eq. (4.67) for the case of $\mathbf{v} = \hat{\mathbf{x}}v_x$ form a subgroup. The matrices in
the general case do not. The product of two Lorentz transformation matrices,
$L(\mathbf{v}_1)$ and $L(\mathbf{v}_2)$, yields a third Lorentz matrix $L(\mathbf{v}_3)$ if the two velocities $\mathbf{v}_1$ and
$\mathbf{v}_2$ are parallel. The resultant velocity $\mathbf{v}_3$ is related to $\mathbf{v}_1$ and $\mathbf{v}_2$ by the Einstein
velocity addition law (Exercise 4.4.3). If $\mathbf{v}_1$ and $\mathbf{v}_2$ are not parallel, no such
simple relation exists. Specifically, consider three reference frames $S$, $S'$, and
$S''$, with $S$ and $S'$ related by $L(\mathbf{v}_1)$ and $S'$ and $S''$ related by $L(\mathbf{v}_2)$. If the velocity
of $S''$ relative to the original system $S$ is $\mathbf{v}_3$, $S''$ is not obtained from $S$ by
$L(\mathbf{v}_3) = L(\mathbf{v}_2)L(\mathbf{v}_1)$. Rather, one can show that

$$L(\mathbf{v}_3) = R\,L(\mathbf{v}_2)L(\mathbf{v}_1), \tag{4.71}$$

where $R$ is a 3 × 3 space rotation matrix. With $\mathbf{v}_1$ and $\mathbf{v}_2$ not parallel, the final
system $S''$ is rotated relative to $S$. This rotation is the origin of the Thomas
precession involved in spin-orbit coupling terms in atomic and nuclear physics.
Because of its presence, the $L(\mathbf{v})$ by themselves do not form a group.
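
A quick numerical way to see the extra rotation is to use the fact that a pure boost matrix, written in the pattern of Eq. (4.67), is symmetric. The sketch below (plain Python) builds 4 × 4 boosts along the $x$- and $y$-axes and measures the asymmetry of their products; the helper names and the sample velocities are assumptions made for the illustration.

```python
import math


def boost(axis, beta):
    """4x4 pure boost along spatial axis 1 (x) or 2 (y), in the pattern of Eq. (4.67)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    m = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    m[0][0] = m[axis][axis] = g
    m[0][axis] = m[axis][0] = -g * beta
    return m


def matmul4(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def asymmetry(m):
    """Largest |m[r][c] - m[c][r]|; zero for a symmetric (pure boost) matrix."""
    return max(abs(m[r][c] - m[c][r]) for r in range(4) for c in range(4))


parallel = matmul4(boost(1, 0.3), boost(1, 0.5))   # both boosts along x
mixed = matmul4(boost(2, 0.3), boost(1, 0.5))      # first along x, then along y
print(asymmetry(parallel), asymmetry(mixed))
```

The product of parallel boosts stays symmetric, whereas the mixed product is not, so it cannot be a single pure boost; the mismatch is what the rotation $R$ in Eq. (4.71) accounts for.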


Vector Analysis in Minkowski Space-Time

We have seen that the propagation of light determines the metric

$$c^2t^2 - \mathbf{r}^2 = 0 = c^2t'^2 - \mathbf{r}'^2,$$

where $x^\mu = (ct, \mathbf{r})$ is the coordinate four-vector. For a particle moving with
velocity $\mathbf{v}$ the Lorentz invariant infinitesimal version

$$c\,d\tau = \sqrt{dx^\mu dx_\mu} = \sqrt{c^2dt^2 - d\mathbf{r}^2} = dt\sqrt{c^2 - \mathbf{v}^2}$$

defines the invariant proper time $\tau$ on its track. Because of time dilation in
moving frames, a proper time clock rides with the particle (in its rest frame)
and runs at the slowest possible rate compared to any other inertial frame (e.g.,
of an observer). The four-velocity of the particle can now be defined properly
as

$$u^\mu = \frac{dx^\mu}{c\,d\tau} = \left(\frac{c}{\sqrt{c^2 - \mathbf{v}^2}}, \frac{\mathbf{v}}{\sqrt{c^2 - \mathbf{v}^2}}\right),$$

so that $u^2 = 1$, and the four-momentum $p^\mu = cmu^\mu = (E/c, \mathbf{p})$ yields Einstein’s
famous energy relation

$$E = \frac{mc^2}{\sqrt{1 - \mathbf{v}^2/c^2}} = mc^2 + \frac{m}{2}\mathbf{v}^2 \pm \cdots.$$

A consequence of $u^2 = 1$ and its physical significance is that the particle is on
its mass shell $p^2 = m^2c^2$.

Now we formulate Newton’s equation for a single particle of mass $m$ in
special relativity as $dp^\mu/d\tau = K^\mu$, with $K^\mu$ denoting the force four-vector, so that
the vector part of the equation coincides with the usual form. For $\mu = 1, 2, 3$
we use $d\tau = dt\sqrt{1 - \mathbf{v}^2/c^2}$ and find

$$\frac{1}{\sqrt{1 - \mathbf{v}^2/c^2}}\frac{d\mathbf{p}}{dt} = \frac{\mathbf{F}}{\sqrt{1 - \mathbf{v}^2/c^2}} = \mathbf{K},$$

determining $\mathbf{K}$ in terms of the usual force $\mathbf{F}$. Next, we need to find $K^0$. We
proceed by analogy with the derivation of energy conservation, multiplying
the force equation into the four-velocity

$$mu_\nu\frac{du^\nu}{d\tau} = \frac{m}{2}\frac{du^2}{d\tau} = 0,$$

because $u^2 = 1 = \text{const}$. The other side of Newton’s equation yields

$$0 = u \cdot K = \frac{K^0}{\sqrt{1 - \mathbf{v}^2/c^2}} - \frac{\mathbf{F} \cdot \mathbf{v}/c}{1 - \mathbf{v}^2/c^2},$$

so that $K^0 = \dfrac{\mathbf{F} \cdot \mathbf{v}/c}{\sqrt{1 - \mathbf{v}^2/c^2}}$ is related to the work done by the force on the particle.

Now we turn to two-body collisions in which energy-momentum conservation
takes the form $p_1 + p_2 = p_3 + p_4$, where $p_i^\mu$ are the particle four-momenta.
Because the scalar product of any four-vector with itself is an invariant under
Lorentz transformations, it is convenient to define the Lorentz invariant energy
squared $s = (p_1 + p_2)^2 = P^2$, where $P^\mu$ is the total four-momentum, and use
units where the velocity of light $c = 1$. The laboratory system (lab) is defined as
the rest frame of the particle with four-momentum $p_2^\mu = (m_2, \mathbf{0})$ and the center
of momentum frame (cms) by the total four-momentum $P^\mu = (E_1 + E_2, \mathbf{0})$.
When the incident lab energy $E_1^L$ is given, then

$$s = p_1^2 + p_2^2 + 2p_1 \cdot p_2 = m_1^2 + m_2^2 + 2m_2E_1^L$$

is determined. Now the cms energies of the four particles are obtained from
scalar products

$$p_1 \cdot P = E_1(E_1 + E_2) = E_1\sqrt{s}$$
so that

$$E_1 = \frac{p_1 \cdot (p_1 + p_2)}{\sqrt{s}} = \frac{m_1^2 + p_1 \cdot p_2}{\sqrt{s}} = \frac{m_1^2 - m_2^2 + s}{2\sqrt{s}},$$

$$E_2 = \frac{p_2 \cdot (p_1 + p_2)}{\sqrt{s}} = \frac{m_2^2 + p_1 \cdot p_2}{\sqrt{s}} = \frac{m_2^2 - m_1^2 + s}{2\sqrt{s}},$$

$$E_3 = \frac{p_3 \cdot (p_3 + p_4)}{\sqrt{s}} = \frac{m_3^2 + p_3 \cdot p_4}{\sqrt{s}} = \frac{m_3^2 - m_4^2 + s}{2\sqrt{s}},$$

$$E_4 = \frac{p_4 \cdot (p_3 + p_4)}{\sqrt{s}} = \frac{m_4^2 + p_3 \cdot p_4}{\sqrt{s}} = \frac{m_4^2 - m_3^2 + s}{2\sqrt{s}},$$

by substituting

$$2p_1 \cdot p_2 = s - m_1^2 - m_2^2, \qquad 2p_3 \cdot p_4 = s - m_3^2 - m_4^2.$$

Thus, all cms energies $E_i$ depend only on the incident energy but not on the
scattering angle. For elastic scattering $m_3 = m_1$, $m_4 = m_2$ so that $E_3 = E_1$, $E_4 =
E_2$. The Lorentz invariant momentum transfer squared

$$t = (p_1 - p_3)^2 = m_1^2 + m_3^2 - 2p_1 \cdot p_3$$

depends linearly on the cosine of the scattering angle.


EXAMPLE 4.4.1

Kaon Decay and Pion Photoproduction Threshold Find the kinetic
energies of the muon of mass 106 MeV and massless neutrino into which a K
meson of mass 494 MeV decays in its rest frame.

Conservation of energy and momentum gives $m_K = E_\mu + E_\nu = \sqrt{s}$. Applying
the relativistic kinematics described previously yields

$$E_\mu = \frac{p_\mu \cdot (p_\mu + p_\nu)}{m_K} = \frac{m_\mu^2 + p_\mu \cdot p_\nu}{m_K},$$

$$E_\nu = \frac{p_\nu \cdot (p_\mu + p_\nu)}{m_K} = \frac{p_\mu \cdot p_\nu}{m_K}.$$

Combining both results, we obtain $m_K^2 = m_\mu^2 + 2p_\mu \cdot p_\nu$ so that

$$E_\mu = T_\mu + m_\mu = \frac{m_K^2 + m_\mu^2}{2m_K} = 258.4\ \text{MeV},$$

$$E_\nu = T_\nu = \frac{m_K^2 - m_\mu^2}{2m_K} = 235.6\ \text{MeV}.$$

As another example, in the production of a neutral pion by an incident photon
according to $\gamma + p \to \pi^0 + p'$ at threshold, the neutral pion and proton are
created at rest in the cms. Therefore,

$$s = (p_\gamma + p)^2 = m_p^2 + 2m_pE_\gamma^L = (p_\pi + p')^2 = (m_\pi + m_p)^2$$

so that $E_\gamma^L = m_\pi + \dfrac{m_\pi^2}{2m_p} = 144.7$ MeV. ■
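
The kinematic relations above translate directly into a few lines of arithmetic. The sketch below (plain Python, units with $c = 1$ and energies in MeV) reproduces the numbers of the example; the function names are illustrative, and the rounded pion and proton masses are assumed values that the example itself does not list.

```python
import math


def invariant_s(m1, m2, e1_lab):
    """s = m1^2 + m2^2 + 2 m2 E1_lab for a target of mass m2 at rest."""
    return m1 ** 2 + m2 ** 2 + 2.0 * m2 * e1_lab


def cms_energy(m_self, m_partner, s):
    """cms energy of one particle of a pair: (m_self^2 - m_partner^2 + s) / (2 sqrt(s))."""
    return (m_self ** 2 - m_partner ** 2 + s) / (2.0 * math.sqrt(s))


def threshold_lab_energy(m_beam, m_target, final_masses):
    """Beam lab energy at which s equals (sum of final-state masses)^2."""
    s_threshold = sum(final_masses) ** 2
    return (s_threshold - m_beam ** 2 - m_target ** 2) / (2.0 * m_target)


M_K, M_MU = 494.0, 106.0        # MeV, values quoted in the example
M_PI0, M_P = 134.98, 938.27     # MeV, assumed rounded masses (not given in the example)

# Kaon decay at rest: sqrt(s) equals the kaon mass.
e_mu = cms_energy(M_MU, 0.0, M_K ** 2)
e_nu = cms_energy(0.0, M_MU, M_K ** 2)

# Pion photoproduction threshold: photon on a proton at rest.
e_gamma = threshold_lab_energy(0.0, M_P, [M_PI0, M_P])
s_at_threshold = invariant_s(0.0, M_P, e_gamma)

print(round(e_mu, 1), round(e_nu, 1), round(e_gamma, 1))
```

Subtracting the muon mass from `e_mu` gives the muon kinetic energy asked for in the example, and `math.sqrt(s_at_threshold)` returns the sum of the pion and proton masses, as it must at threshold.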
SUMMARY


The Lorentz group is the symmetry group of electrodynamics: It governs special
relativity of the electroweak gauge theory and of the strong interactions
described by quantum chromodynamics. The metric of Minkowski space-time
is Lorentz invariant and expresses the propagation of light; that is, the velocity
of light is the same in all inertial frames. Newton’s equations of motion are
straightforward to extend to special relativity. The kinematics of two-body
collisions are important applications of vector algebra in Minkowski space-time.


Biographical Data 

Lorentz, Hendrik Antoon. Lorentz, a Dutch physicist, was born in 1853
in Arnhem and died in 1928 in Haarlem. He obtained his Ph.D. in 1875 at Leiden
University, where he returned 3 years later as a professor of theoretical
physics and stayed until his death. He refined Maxwell’s theory of radiation,
and light in particular, attributing it to oscillations of charged particles
within matter at a time when atoms were not universally recognized. These
were later identified as electrons by J. J. Thomson and ions by Arrhenius. As
proof, he suggested subjecting atoms to magnetic fields, predicting effects
that were demonstrated by his student Zeeman in 1896. He also analyzed the
physical consequences (Lorentz-Fitzgerald contraction) of the invariance of
Maxwell’s equations under transformations that depend on the relative velocity
of the inertial frames (now named after him) and differ from those
(Galilean transformations) of Newton’s equations of motion. Thus, he was
a forerunner of Einstein’s special relativity.


EXERCISES 

4.4.1 Two Lorentz transformations are carried out in succession: $v_1$ along the
$x$-axis and then $v_2$ along the $y$-axis. Show that the resultant transformation
(given by the product of these two successive transformations)
cannot be put in the form of a single Lorentz transformation.

Note. The discrepancy corresponds to a rotation so that pure Lorentz
transformations (boosts) do not form a group.

4.4.2 Rederive the Lorentz transformation working entirely in Minkowski
space $(x^0, x^1, x^2, x^3)$ with $x^0 = x_0 = ct$. Show that the Lorentz transformation
may be written $L(\mathbf{v}) = \exp(\rho\sigma)$, with

$$\sigma = \begin{pmatrix} 0 & -\lambda & -\mu & -\nu \\ -\lambda & 0 & 0 & 0 \\ -\mu & 0 & 0 & 0 \\ -\nu & 0 & 0 & 0 \end{pmatrix},$$

and $\lambda, \mu, \nu$ the direction cosines of the velocity $\mathbf{v}$.

4.4.3 Using the matrix relation, Eq. (4.67), let the rapidity $\rho_1$ relate the Lorentz
reference frames $(x'^0, x'^1)$ and $(x^0, x^1)$. Let $\rho_2$ relate $(x''^0, x''^1)$ and
$(x'^0, x'^1)$. Finally, let $\rho$ relate $(x''^0, x''^1)$ and $(x^0, x^1)$. From $\rho = \rho_1 + \rho_2$,
derive the Einstein velocity addition law

$$v = \frac{v_1 + v_2}{1 + v_1v_2/c^2}.$$

4.4.4 (a) A particle A decays into B and C. What is the energy of the particle B
in the rest frame of A? An example is a neutral pion that decays into two
photons. (b) Explain why a photon of sufficient energy can produce an
electron-positron pair only in the presence of matter.

Note. The particle masses obey $m_A > m_B + m_C$.

4.4.5 Determine the minimum frequency of a $\gamma$-ray that disintegrates a
deuteron into a neutron of mass $m_n = 939.565$ MeV/$c^2$ and a proton
of mass $m_p = 938.272$ MeV/$c^2$. The deuteron has a binding energy of
2.226 MeV.

Hint. Ignore the Fermi motion; that is, take the neutron and proton at
rest. Explain why this is a good approximation.

4.4.6 An observer moving with four-velocity $u^\mu$ measures the energy $E$ of a
particle with four-momentum $p^\mu$. Show that $E = cp \cdot u$.

4.4.7 Derive the relativistic generalization of the equation of motion $d\mathbf{p}/dt =
q\mathbf{E} + q\mathbf{v} \times \mathbf{B}$ of a charged particle in an electromagnetic field.

4.4.8 Show that the Lorentz covariant equation of motion $\dfrac{dp^\mu}{d\tau} = \dfrac{q}{\varepsilon_0mc}F^{\mu\nu}p_\nu$
of a charged particle in an electromagnetic field is consistent with (has
as a solution) the on-mass-shell relation $p^2 = m^2c^2$. Here, $m$ is the rest
mass of the particle and $\tau$ its proper time (i.e., the Lorentz invariant
time measured in its rest frame), $p^\mu = mcu^\mu$ is its four-momentum, and
$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$ is the electromagnetic field tensor.


Additional Reading 


Buerger, M. J. (1956). Elementary Crystallography. Wiley, New York. A
comprehensive discussion of crystal symmetries. Buerger develops all 32 point
groups and all 230 space groups. Related books by this author include 
Contemporary Crystallography. McGraw-Hill, New York (1970); Crystal 
Structure Analysis. Krieger, New York (1979); and Introduction to Crystal 
Geometry. Krieger, New York (1971, reprint 1977). 

Burns, G., and Glazer, A. M. (1978). Space Groups for Solid State Scientists. 
Academic Press, New York. A well-organized, readable treatment of groups 
and their application to the solid state. 

de-Shalit, A., and Talmi, I. (1963). Nuclear Shell Model. Academic Press, New
York. We adopt the Condon-Shortley phase conventions of this text. 

Edmonds, A. R. (1957). Angular Momentum in Quantum Mechanics.
Princeton Univ. Press, Princeton, NJ.

Falicov, L. M. (1966). Group Theory and Its Physical Applications [Notes
compiled by A. Luehrmann]. Univ. of Chicago Press, Chicago. Group theory
with an emphasis on applications to crystal symmetries and solid-state
physics.

Greiner, W., and Müller, B. (1989). Quantum Mechanics Symmetries. Springer,
Berlin. We refer to this textbook for more details and numerous exercises 
that are worked out in detail. 

Hamermesh, M. (1962). Group Theory and Its Application to Physical
Problems. Addison-Wesley, Reading, MA. A detailed, rigorous account of both
finite and continuous groups. The 32 point groups are developed. The
continuous groups are treated with Lie algebra included. A wealth of
applications to atomic and nuclear physics.

Heitler, W. (1947). The Quantum Theory of Radiation, 2nd ed. Oxford Univ. 
Press, Oxford. Reprinted, Dover, New York (1983). 

Higman, B. (1955). Applied Group-Theoretic and Matrix Methods.
Clarendon, Oxford. A complete and unusually intelligible development of matrix
analysis and group theory. 

Messiah, A. (1961). Quantum Mechanics, Vol. 2. North-Holland, Amsterdam. 

Panofsky, W. K. H., and Phillips, M. (1962). Classical Electricity and
Magnetism, 2nd ed. Addison-Wesley, Reading, MA. The Lorentz covariance of
Maxwell’s equations is developed for both vacuum and material media. 
Panofsky and Phillips use contravariant and covariant tensors. 

Park, D. (1968). Resource letter SP-1 on symmetry in physics. Am. J. Phys.
36, 577-584. Includes a large selection of basic references on group
theory and its applications to physics: atoms, molecules, nuclei, solids, and
elementary particles.

Ram, B. (1967). Physics of the SU(3) symmetry model. Am. J. Phys. 35, 16.
An excellent discussion of the applications of SU(3) to the strongly
interacting particles (baryons). For a sequel to this, see Young, R. D. (1973).
Physics of the quark model. Am. J. Phys. 41, 472.

Rose, M. E. (1957). Elementary Theory of Angular Momentum. Wiley, New 
York. Reprinted, Dover, New York (1995). As part of the development of 
the quantum theory of angular momentum, Rose includes a detailed and 
readable account of the rotation group. 

Wigner, E. P. (1959). Group Theory and Its Application to the Quantum
Mechanics of Atomic Spectra (translated by J. J. Griffin). Academic Press,
New York. This is the classic reference on group theory for the physicist. 
The rotation group is treated in considerable detail. There is a wealth of 
applications to atomic physics. 




Infinite Series 


5.1 Fundamental Concepts 


Infinite series are summations of infinitely many real numbers. They occur
frequently in both pure and applied mathematics as in accurate numerical
approximations of constants such as $\sqrt{0.97} = (1 - 0.03)^{1/2} = 1 - 0.03/2 +
\cdots \approx 0.985$, $\pi$, $e$, and periodic decimals by rational numbers. Elementary and
transcendental functions may be defined by power series in a fundamental
approach to the theory of functions in Section 5.6. In science and engineering
infinite series are ubiquitous because they appear in the evaluation of integrals
(Section 5.10), in the solution of differential equations (Sections 8.5 and 8.6),
and as Fourier series (Chapter 14) and compete with integral representations
for the description of a host of special functions (Chapters 11-13).

Right at the start we face the problem of attaching meaning to the sum of
an infinite number of terms. The usual approach is by partial sums. If we have
an infinite sequence of terms $u_1, u_2, u_3, u_4, u_5, \ldots$, we define the $n$th partial
sum as

$$s_n = \sum_{i=1}^{n} u_i. \tag{5.1}$$

This is a finite sum and offers no difficulties.

Whenever the sequence of partial sums does not approach a finite limit, 
the infinite series is said to diverge. 


EXAMPLE 5.1.1 


Diverging Partial Sums The sum of positive integers

$$\sum_{i=1}^{n} i = \frac{n}{2}(n+1)$$

can be summed by pairing the first and last terms, whose sum is $n + 1$, then
the second and next to last term, whose sum is $n + 1$ as well, etc. Because
there are $n/2$ pairs, we get the result. Here, we have tacitly taken $n$ to be even.
However, for odd $n$, the same argument can be made including zero as the first
term. We can also use mathematical induction by verifying that the result is
valid for $n = 1$, giving $1 = \frac{1}{2}(1 + 1)$. The next step is to assume the result is
true for $n$ and then prove it for $n + 1$ as follows:

$$s_n + (n+1) = \frac{n}{2}(n+1) + (n+1) = \left(\frac{n}{2} + 1\right)(n+1) = \frac{n+1}{2}(n+2) = s_{n+1}.$$

Thus, it is valid for all natural numbers. Clearly, $s_n \to \infty$ as $n \to \infty$, and the
series diverges. Moreover, the partial sums are independent of the order in
which we sum the terms. ■


If the partial sums $s_n$ approach a finite limit as $n \to \infty$,

$$\lim_{n\to\infty} s_n = S, \tag{5.2}$$

the infinite series $\sum_{n=1}^{\infty} u_n$ is defined to be convergent and to have the value
$S$. Note that we reasonably, plausibly, but still arbitrarily define the infinite
series as equal to $S$. Now consider one of the simplest convergent series.


EXAMPLE 5.1.2


The Geometric Series The geometrical sequence, starting with $a$ and with
a ratio $r$ ($= u_{n+1}/u_n$) independent of $n$, is given by

$$a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} + \cdots.$$


The $n$th partial sum is given by¹

$$s_n = a\,\frac{1 - r^n}{1 - r}. \tag{5.3}$$

Taking the limit as $n \to \infty$,

$$\lim_{n\to\infty} s_n = \frac{a}{1 - r}, \quad \text{for } |r| < 1. \tag{5.4}$$

Hence, by definition, the infinite geometric series converges for $|r| < 1$ and
has the value

$$\sum_{n=1}^{\infty} ar^{n-1} = \frac{a}{1 - r}. \tag{5.5}$$

However, for $|r| > 1$,

$$\lim_{n\to\infty} |s_n| = \lim_{n\to\infty} \frac{|a|\,|r|^n}{|1 - r|} \to \infty,$$

so that the series diverges.
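As a quick numerical illustration of Eqs. (5.3) and (5.4), the following sketch compares a term-by-term partial sum with the closed form. The function names and the sample values $a = 2$, $r = 1/2$ are our own choices for illustration, not part of the text.

```python
def geometric_partial_sum(a, r, n):
    """Add the first n terms a + a*r + ... + a*r**(n-1) one by one."""
    total, term = 0.0, a
    for _ in range(n):
        total += term
        term *= r
    return total


def geometric_closed_form(a, r, n):
    """Closed form of Eq. (5.3); requires r != 1."""
    return a * (1 - r**n) / (1 - r)


a, r = 2.0, 0.5  # illustrative values with |r| < 1
for n in (1, 5, 10, 50):
    print(n, geometric_partial_sum(a, r, n), geometric_closed_form(a, r, n))
print("limit a/(1 - r) =", a / (1 - r))
```

For $|r| < 1$ the two columns agree and approach $a/(1 - r)$; choosing $|r| > 1$ instead shows the partial sums growing without bound.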


¹ Multiply and divide $s_n = a\sum_{m=0}^{n-1} r^m$ by $1 - r$, which gives $(1 + r + \cdots + r^{n-1})(1 - r) = 1 + r +
\cdots + r^{n-1} - r - r^2 - \cdots - r^n = 1 - r^n$.

This exact series summation can be used to determine periodic decimals
in terms of rational numbers as follows. Let us consider two typical cases

$$1.111\cdots = 1 + 10^{-1} + 10^{-2} + \cdots = \frac{1}{1 - 10^{-1}} = \frac{10}{9},$$

$$0.131313\cdots = \frac{13}{100} + \frac{13}{10^4} + \cdots = \frac{13}{100}\left[1 + 10^{-2} + 10^{-4} + \cdots\right] = \frac{13}{100}\,\frac{1}{1 - 10^{-2}} = \frac{13}{99}.$$
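The conversion of a purely periodic decimal into a fraction can be sketched with exact rational arithmetic. The helper name `repeating_decimal` is hypothetical; it simply applies $a/(1 - r)$ from Eq. (5.5) with $a$ the first block and $r = 10^{-\text{digits}}$.

```python
from fractions import Fraction


def repeating_decimal(block, digits):
    """Fraction for 0.(block)(block)..., where block is written with `digits` digits.

    The first term is block/10**digits and the ratio is 10**(-digits),
    so the geometric sum a/(1 - r) applies with |r| < 1.
    """
    a = Fraction(block, 10**digits)
    r = Fraction(1, 10**digits)
    return a / (1 - r)


print(repeating_decimal(13, 2))     # 0.131313... -> 13/99
print(1 + repeating_decimal(1, 1))  # 1.111...    -> 10/9
```

Because `Fraction` keeps numerators and denominators exact, the results can be compared directly with the two cases worked out above.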


These examples suggest that a necessary condition for the convergence to
a limit is that $\lim_{n\to\infty} u_n = 0$. This condition is not sufficient to guarantee
convergence. However, for special series it can be considerably sharpened, as
shown in the following theorem:


If the $u_n > 0$ are monotonically decreasing to zero, that is, $u_n > u_{n+1}$ for
all $n$, then $\sum_n u_n$ is converging to $S$ if, and only if, $s_n - nu_n$ converges to $S$.


As the partial sums $s_n$ converge to $S$, this theorem implies that $nu_n \to 0$, for
$n \to \infty$.

To prove this theorem, we start by concluding from $0 < u_{n+1} < u_n$ and

$$s_{n+1} - (n+1)u_{n+1} = s_n - nu_{n+1} = s_n - nu_n + n(u_n - u_{n+1}) > s_n - nu_n$$

that $s_n - nu_n$ increases as $n \to \infty$. As a consequence of $s_n - nu_n < s_n \le
S$, $s_n - nu_n$ converges to a value $s \le S$. Deleting the tail of positive terms
$u_i - u_n$ from $i = \nu + 1$ to $n$, we infer from $s_n - nu_n \ge (u_1 - u_n) + \cdots +
(u_\nu - u_n) = s_\nu - \nu u_n$ that $s_n - nu_n \ge s_\nu$ for $n \to \infty$. Hence also $s \ge S$ so that
$s = S$ and $nu_n \to 0$.

When this theorem is applied to the harmonic series $\sum_n 1/n$ with $n \cdot \frac{1}{n} = 1$, it
implies that it does not converge; it diverges to $+\infty$. This can also be verified
as follows.


EXAMPLE 5.1.3 


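Regrouping of the Harmonic Series

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} + \cdots. \tag{5.6}$$

We have $\lim_{n\to\infty} u_n = \lim_{n\to\infty} 1/n = 0$, but this is not sufficient to
guarantee convergence, as we have just shown. To see this independently of the
theorem, we sum the terms as indicated by the parentheses

$$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \left(\frac{1}{9} + \cdots + \frac{1}{16}\right) + \cdots. \tag{5.7}$$

Then each pair of parentheses encloses $p$ terms for $p = 2, 3, \ldots$ of the form

$$\frac{1}{p+1} + \frac{1}{p+2} + \cdots + \frac{1}{p+p} > \frac{p}{2p} = \frac{1}{2}. \tag{5.8}$$

Forming partial sums by adding the parenthetical groups one by one, we obtain

$$s_1 = 1, \quad s_2 = \frac{3}{2}, \quad s_3 > \frac{4}{2}, \quad s_4 > \frac{5}{2}, \quad s_5 > \frac{6}{2}, \quad \ldots, \quad s_n > \frac{n+1}{2}. \tag{5.9}$$

The harmonic series summed in this way is certainly divergent.²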
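The bound of Eq. (5.9) can be checked exactly for the first few groups. In the sketch below (our own illustration, with a hypothetical helper `harmonic`), the group ending at $1/2^m$ corresponds to the partial sum labeled $s_{m+1}$ above, so the claim reads $H(2^m) \ge 1 + m/2$.

```python
from fractions import Fraction


def harmonic(n):
    """Exact nth partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# Each added group of parenthesized terms contributes more than 1/2,
# so after the group ending at 1/2**m the sum is at least 1 + m/2.
for m in range(0, 11):
    h = harmonic(2**m)
    bound = 1 + Fraction(m, 2)
    print(m, float(h), float(bound), h >= bound)
```

The lower bound grows without limit, although very slowly: doubling the number of terms raises it by only 1/2.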


The condition for the validity of Eq. (5.2) is usually written in formal
mathematical notation as follows:

The condition for the existence of a limit $S$ is that for each $\varepsilon > 0$, there is a
fixed $N = N(\varepsilon)$ such that

$$|S - s_i| < \varepsilon, \quad \text{for } i > N.$$

Note that the size of the first few or any fixed finite number of terms does not
matter for convergence; it is the infinite tail that is decisive for
convergence. This convergence condition is often derived from the Cauchy criterion
applied to the partial sums $s_i$. The Cauchy criterion is as follows:

A necessary and sufficient condition that a sequence $(s_i)$ converges is that
for each $\varepsilon > 0$ there is a fixed number $N$ such that

$$|s_j - s_i| < \varepsilon \quad \text{for all } i, j > N.$$

This means that the individual partial sums must cluster together as we 
move far out in the sequence, into the infinitely long tail of the series. 



Addition and Subtraction of Series 

If we have two convergent series $\sum_n u_n \to s$ and $\sum_n v_n \to S$, their sum and
difference will also converge to $s \pm S$ because their partial sums satisfy

$$|s_j \pm S_j - (s_i \pm S_i)| = |s_j - s_i \pm (S_j - S_i)| \le |s_j - s_i| + |S_j - S_i| < 2\varepsilon$$

using the triangle inequality

$$|a| - |b| \le |a + b| \le |a| + |b|$$

for $a = s_j - s_i$, $b = S_j - S_i$.

A convergent series $\sum_n u_n \to S$ may be multiplied termwise by a real
number $a$. The new series will converge to $aS$ because

$$|as_j - as_i| = |a(s_j - s_i)| = |a|\,|s_j - s_i| < |a|\varepsilon.$$


² The (finite) harmonic series appears in an interesting note on the maximum stable displacement
of a stack of coins: Johnson, P. R. (1955). The leaning tower of Lire. Am. J. Phys. 23, 240.

This multiplication by a constant can be generalized to a multiplication by
terms $c_n$ of a bounded sequence of numbers.


If $\sum_n u_n$ converges to $S$ and $0 < c_n \le M$ are bounded, then $\sum_n u_n c_n$ is
convergent. If $\sum_n u_n$ is divergent and $c_n > M > 0$, then $\sum_n u_n c_n$ diverges.


To prove this theorem, we take $i$, $j$ sufficiently large so that $|s_j - s_i| < \varepsilon$.
Then

$$\sum_{n=i+1}^{j} u_n c_n \le M \sum_{n=i+1}^{j} u_n = M|s_j - s_i| < M\varepsilon.$$

The divergent case follows from

$$\sum_n u_n c_n > M \sum_n u_n \to \infty.$$


Multiplication of two convergent series will be addressed in Section 5.4. 
Our partial sums s n may not converge to a single limit but may oscillate, as 
demonstrated by the following example. 


EXAMPLE 5.1.4 


Oscillatory Series 

$$\sum_{n=1}^{\infty} u_n = 1 - 1 + 1 - 1 + 1 + \cdots - (-1)^n + \cdots.$$

Clearly, $s_n = 1$ for $n$ odd but 0 for $n$ even. There is no convergence to a limit,
and series such as this one are labeled oscillatory and cannot be assigned a
value. However, using the geometric series,³ we may expand the function

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \cdots + (-x)^{n-1} + \cdots. \tag{5.10}$$

If we let $x \to 1$, this series becomes

$$1 - 1 + 1 - 1 + 1 - 1 + \cdots, \tag{5.11}$$

the same series that we just labeled oscillatory. Although it does not converge
in the usual sense, meaning can be attached to this series. Euler, for example,
assigned a value of 1/2 to this oscillatory sequence on the basis of the
correspondence between this series and the well-defined function $(1 + x)^{-1}$.
Unfortunately, such correspondence between series and function is not unique and
this approach must be refined (see Hardy, 1956, Additional Reading). Other
methods of assigning a meaning to a divergent or oscillatory series, methods
of defining a sum, have been developed.

In general, however, this aspect of infinite series is of relatively little
interest to the scientist or the engineer. An exception to this statement, the
very important asymptotic or semiconvergent series, is considered in
Section 5.8. ■
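Euler's assignment of 1/2 to the oscillatory series can be explored numerically by summing Eq. (5.10) for $x$ slightly below 1. This is only a sketch of that limiting procedure; the function name and the tolerance are our own assumptions.

```python
def alternating_power_sum(x, tol=1e-15):
    """Sum 1 - x + x**2 - ... for 0 <= x < 1 until the terms drop below tol."""
    total, term = 0.0, 1.0
    while abs(term) > tol:
        total += term
        term *= -x
    return total


for x in (0.9, 0.99, 0.999):
    print(x, alternating_power_sum(x), 1 / (1 + x))
```

For every $x < 1$ the series converges to $1/(1 + x)$, and these values approach 1/2 as $x \to 1$, even though the series at $x = 1$ itself has no limit.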


3 Actually, Eq. (5.10) may be taken as an identity and verified by multiplying both sides by 1 + x, 
i.e., the geometric series of Eq. (5.5) with a = 1 and r = — x. 
SUMMARY


In summary, infinite series often appear in physics by means of Taylor
expansions or binomial or other expansions. Truncating appropriate series is central
to the art of approximation in physics. The geometric series is the simplest that
can be summed exactly and illustrates convergence properties.


EXERCISES 

5.1.1 Find the fractions that correspond to the following decimals:

(a) $0.222\cdots$
(b) $0.0101\cdots$
(c) $0.123123\cdots$
(d) $0.45234523\cdots$

5.1.2 Show that

$$\sum_{n=1}^{\infty} \frac{1}{(2n-1)(2n+1)} = \frac{1}{2}.$$

Hint. Show (by mathematical induction) that $s_m = m/(2m+1)$.

5.1.3 Show that

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1.$$

Find the partial sum $s_m$ and verify its correctness by mathematical
induction.

Note. The method of expansion in partial fractions (Section 15.8) offers 
an alternative way of solving Exercises 5.1.2 and 5.1.3. 


5.2 Convergence Tests 


It is important to be able to tell whether a given series is convergent. We shall 
develop a few tests, starting with the simple and relatively insensitive tests 
and then the more complicated but quite sensitive integral tests. For now, let 
us consider a series of positive terms, $a_n > 0$, postponing negative terms
until the next section except when otherwise noted. 


Comparison Test 


We start with the theorem: If, term by term, a series satisfies $0 \le u_n \le a_n$, in
which the $a_n$ form a convergent series, then the series $\sum_n u_n$ is also convergent.

To prove it, assume that $u_n \le a_n$ for all $n$; then $\sum_n u_n \le \sum_n a_n$ and $\sum_n u_n$
therefore is convergent. Similarly, if term by term a series satisfies $v_n \ge b_n$, in
which the $b_n$ form a divergent series, then the series $\sum_n v_n$ is also divergent. If
$v_n \ge b_n$ for all $n$, then $\sum_n v_n \ge \sum_n b_n$ and $\sum_n v_n$ therefore is divergent.




EXAMPLE 5.2.1 


Riemann Zeta Function Series If we sum the inverse squares of all natural
numbers, $1/n^2$, does the sum $\sum_n n^{-2}$ converge? This sum is the special case
of the Riemann zeta function

$$\zeta(x) = \sum_{n=1}^{\infty} \frac{1}{n^x} = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \cdots \tag{5.12}$$

for $x = 2$, which plays a prominent role in analytic prime number theory. The
harmonic series is $\zeta(x)$ for $x \to 1$, where the function diverges as $1/(x - 1)$.
Its special values at integer arguments occur in statistical mechanics. To see
that it converges for $x = 2$, we use the convergent series $\sum_n 1/[n(n + 1)]$ of
Exercise 5.1.3 for comparison. Note that for $n \ge 1$, clearly $n + 1 > n$; hence,
$(n + 1)^2 > n(n + 1)$ and, taking the inverse, $\frac{1}{(n+1)^2} < \frac{1}{n(n+1)}$. Therefore, our
series converges and from the first two terms and the comparison sum, we
find

$$\frac{5}{4} < 1 + \frac{1}{4} + \frac{1}{9} + \cdots < 1 + 1 = 2.$$

Its actual value, $\pi^2/6 \approx 1.6449\cdots$, was found by Euler along with $\zeta(4)$, $\zeta(6)$, $\ldots$
in terms of powers of $\pi$ and Bernoulli numbers $B_n$ that are related to the finite
sums of powers of natural numbers $\sum_{i=1}^{N} i^n$. It takes approximately 200 terms
in Eq. (5.12) to get the correct value of the second decimal place of $\zeta(2)$. ■
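The slow convergence quoted above is easy to check by direct summation. The following sketch (our own illustration; the helper name `terms_needed` is hypothetical) counts how many terms of Eq. (5.12) with $x = 2$ are needed before the partial sum first reaches 1.64.

```python
import math

target = math.pi**2 / 6  # Euler's value of zeta(2)


def terms_needed(threshold):
    """Smallest N with 1 + 1/4 + ... + 1/N**2 >= threshold.

    The threshold must lie below pi**2/6, otherwise the loop never ends.
    """
    total, n = 0.0, 0
    while total < threshold:
        n += 1
        total += 1.0 / n**2
    return n, total


n, s = terms_needed(1.64)  # partial sum first shows the digits 1.64
print(n, s, target)
```

The count comes out at roughly 200, in line with the statement in the example, because the neglected tail after $N$ terms is close to $1/N$.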

For the convergent comparison series a n we already have the geometric series, 
whereas the harmonic series will serve as the divergent series b n . As other 
series are identified as either convergent or divergent, they may be used for 
the known series in this comparison test. All tests developed in this section 
are essentially comparison tests. 


Cauchy Root Test 


If $0 \le (a_n)^{1/n} \le r < 1$ for all sufficiently large $n$, with $r$ independent of $n$, then
$\sum_n a_n$ is convergent. If $(a_n)^{1/n} \ge 1$ for all sufficiently large $n$, then $\sum_n a_n$ is
divergent. The first part of this test is verified easily by raising $(a_n)^{1/n} \le r$ to
the $n$th power. We get

$$a_n \le r^n < 1.$$

Since $r^n$ is the $n$th term in a convergent geometric series, $\sum_n a_n$ is convergent
by the comparison test. Conversely, if $(a_n)^{1/n} \ge 1$, then $a_n \ge 1$ and the series
must diverge. This root test is particularly useful in establishing the properties 
of power series (Section 5.7). 


d’Alembert or Cauchy Ratio Test 


If $0 \le a_{n+1}/a_n \le r < 1$ for all sufficiently large $n$, and $r$ is independent of $n$,
then $\sum_n a_n$ is convergent. If $a_{n+1}/a_n \ge 1$ for all sufficiently large $n$, then $\sum_n a_n$
is divergent.

Convergence is proved by direct comparison with the geometric series
$(1 + r + r^2 + \cdots)$, for which $r^{n+1}/r^n = r < 1$. In the second part, $a_{n+1} \ge a_n$ and
divergence is obvious since not even $a_n \to 0$. Although not quite as sensitive
as the Cauchy root test, the d’Alembert ratio test is one of the easiest to apply
and is widely used. An alternate statement of the ratio test is in the form of a
limit: If

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} \begin{cases} < 1, & \text{convergence}, \\ > 1, & \text{divergence}, \\ = 1, & \text{indeterminate}. \end{cases} \tag{5.13}$$

Because of this final indeterminate possibility, the ratio test is likely to fail at
crucial points, and more delicate, sensitive tests are necessary. Let us illustrate
how this indeterminacy arises. Actually, it is concealed in the first statement
$a_{n+1}/a_n \le r < 1$. We might encounter $a_{n+1}/a_n < 1$ for all finite $n$ but be
unable to choose an $r < 1$ and independent of $n$ such that $a_{n+1}/a_n \le r$ for
all sufficiently large $n$. An example is provided by Example 5.2.2.


EXAMPLE 5.2.2 


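Harmonic Series For the harmonic series $\sum_n n^{-1}$,

$$\frac{a_{n+1}}{a_n} = \frac{n}{n+1} < 1, \qquad n = 1, 2, \ldots,$$

but because

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1, \tag{5.14}$$

no fixed ratio $r < 1$ exists and the ratio test fails. ■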
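A short numerical sketch makes the indeterminate case concrete. The helper `ratios` is hypothetical, and the contrast series $n/2^n$ is our own choice of a series for which the ratio test does decide; it is not taken from the example.

```python
def ratios(term, n_values):
    """Return a_{n+1}/a_n for each n in n_values, with a_n = term(n)."""
    return [term(n + 1) / term(n) for n in n_values]


ns = (1, 10, 100, 1000)
print(ratios(lambda n: 1.0 / n, ns))     # harmonic series: ratios creep up to 1
print(ratios(lambda n: n / 2.0**n, ns))  # assumed contrast series n/2**n: ratios tend to 1/2
```

Every harmonic ratio is below 1, yet no single $r < 1$ bounds all of them, which is exactly why the test is silent here; the contrast series settles near 1/2 and is therefore covered by the convergent case.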


Cauchy or Maclaurin Integral Test 


This is the most common comparison test in which we compare a series with
an integral. The test applies to $\sum_n a_n$, where $a_n = f(n)$ can be integrated in
the (continuous) variable $n$ and falls to zero monotonically. Geometrically, we
compare the area of a series of unit-width rectangles with the area under a
curve.

Let $f(x)$ be a continuous, monotonically decreasing function in which
$f(n) = a_n$. Then $\sum_n a_n$ converges if $\int_1^\infty f(x)\,dx$ is finite and diverges if the
integral is infinite. For the $i$th partial sum

$$s_i = \sum_{n=1}^{i} a_n = \sum_{n=1}^{i} f(n). \tag{5.15}$$

However,

$$s_i > \int_1^{i+1} f(x)\,dx \tag{5.16}$$

from Fig. 5.1a, with $f(x)$ being monotonically decreasing. On the other hand,
from Fig. 5.1b,

$$s_i - a_1 < \int_1^{i} f(x)\,dx, \tag{5.17}$$

in which the series is represented by the inscribed rectangles.

Figure 5.1 (a) Comparison of Integral and Sum-blocks Leading, (b) Comparison of Integral and Sum-blocks Lagging

Taking the limit as $i \to \infty$, we have

$$\int_1^\infty f(x)\,dx \le \sum_{n=1}^{\infty} a_n \le \int_1^\infty f(x)\,dx + a_1. \tag{5.18}$$

Hence, the infinite series converges or diverges as the corresponding integral
converges or diverges. This integral test is particularly useful in setting upper
and lower bounds on the remainder of a series after some number of initial
terms have been summed. That is,

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{N} a_n + \sum_{n=N+1}^{\infty} a_n, \tag{5.19}$$

where

$$\int_{N+1}^\infty f(x)\,dx \le \sum_{n=N+1}^{\infty} a_n \le \int_{N+1}^\infty f(x)\,dx + a_{N+1}. \tag{5.20}$$
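The remainder bounds of Eq. (5.20) can be sketched for the series $\sum_n n^{-2}$ of Example 5.2.1, where the integral is elementary: $\int_{N+1}^\infty x^{-2}\,dx = 1/(N+1)$. The function name `zeta2_bounds` and the choice $N = 100$ are our own illustration.

```python
import math


def zeta2_bounds(N):
    """Lower and upper bounds on sum 1/n**2 from N summed terms plus Eq. (5.20)."""
    head = sum(1.0 / n**2 for n in range(1, N + 1))
    integral = 1.0 / (N + 1)            # integral of x**-2 from N+1 to infinity
    first_dropped = 1.0 / (N + 1) ** 2  # a_{N+1}
    return head + integral, head + integral + first_dropped


lo, hi = zeta2_bounds(100)
print(lo, math.pi**2 / 6, hi)
```

With only 100 terms the two bounds differ by $a_{101} \approx 10^{-4}$ and bracket $\pi^2/6$, a far tighter statement than the bare partial sum provides.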

To free the integral test from the restrictive requirement that the
interpolating function $f(x)$ be positive and monotonic, we show for any function $f(x)$
with a continuous derivative that

$$\sum_{n=N_i+1}^{N_f} f(n) = \int_{N_i}^{N_f} f(x)\,dx + \int_{N_i}^{N_f} (x - [x])\,f'(x)\,dx \tag{5.21}$$

holds. Here, $[x]$ denotes the largest integer not exceeding $x$ so that $x - [x]$ varies
sawtooth-like between 0 and 1. To derive Eq. (5.21), we observe that

$$\int_{N_i}^{N_f} x f'(x)\,dx = N_f f(N_f) - N_i f(N_i) - \int_{N_i}^{N_f} f(x)\,dx \tag{5.22}$$

using integration by parts. Next we evaluate the integral

$$\int_{N_i}^{N_f} [x] f'(x)\,dx = \sum_{n=N_i}^{N_f-1} n \int_n^{n+1} f'(x)\,dx = \sum_{n=N_i}^{N_f-1} n\{f(n+1) - f(n)\}
= -\sum_{n=N_i+1}^{N_f} f(n) - N_i f(N_i) + N_f f(N_f). \tag{5.23}$$

Subtracting Eq. (5.23) from Eq. (5.22), we arrive at Eq. (5.21). Note that $f(x)$
may go up or down and even change sign so that Eq. (5.21) applies to alternating
series as well (see Section 5.3). Usually, $f'(x)$ falls faster than $f(x)$ for $x \to \infty$
so that the remainder term in Eq. (5.21) converges better. It is easy to improve
Eq. (5.21) replacing $x - [x]$ by $x - [x] - \frac{1}{2}$, which varies between $-\frac{1}{2}$ and $\frac{1}{2}$:

$$\sum_{N_i < n \le N_f} f(n) = \int_{N_i}^{N_f} f(x)\,dx + \int_{N_i}^{N_f} \left(x - [x] - \tfrac{1}{2}\right) f'(x)\,dx + \tfrac{1}{2}\{f(N_f) - f(N_i)\}. \tag{5.24}$$

This formula was discovered independently by Euler and Maclaurin. Note that,
as a periodic function, $x - [x] - \frac{1}{2} = -\sum_{n=1}^{\infty} \frac{\sin(2n\pi x)}{n\pi}$ has a Fourier expansion
introduced in Chapter 14. The $f'(x)$ integral becomes even smaller if $f'(x)$
does not change sign too often. For an application of this integral test to an
alternating series, see Example 5.3.2.


EXAMPLE 5.2.3 


Riemann Zeta Function For the Riemann zeta function defined in
Example 5.2.1 as

$$\zeta(p) = \sum_{n=1}^{\infty} n^{-p}, \tag{5.25}$$

we may take $f(n) = n^{-p}$ and then

$$\int_1^\infty n^{-p}\,dn = \begin{cases} \left.\dfrac{n^{-p+1}}{-p+1}\right|_1^\infty, & p \ne 1, \\[2ex] \left.\ln n\,\right|_1^\infty, & p = 1. \end{cases} \tag{5.26}$$

The integral and therefore the series are divergent for $p \le 1$ and convergent for
$p > 1$. Hence, Eq. (5.25) should carry the condition $p > 1$. This, incidentally,
is an independent proof that the harmonic series ($p = 1$) diverges
logarithmically. The sum of the first million terms $\sum_{n=1}^{1,000,000} n^{-1}$ is only $14.392\,726\cdots$.
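The quoted value for the first million terms can be reproduced directly, together with the integral bounds that follow from Eqs. (5.16) and (5.17) with $f(x) = 1/x$, namely $\ln(N+1) < s_N < 1 + \ln N$. The sketch below is our own check and uses `math.fsum` only to limit rounding error in the long sum.

```python
import math

N = 1_000_000
h = math.fsum(1.0 / n for n in range(1, N + 1))

print(h)                                 # partial sum of the harmonic series
print(math.log(N + 1), 1 + math.log(N))  # lower and upper integral bounds
```

The sum lies between the two logarithmic bounds, which shows how slowly the divergence sets in: a millionfold increase in the number of terms adds less than 14 to the total.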

This integral comparison may also be used to set an upper limit to the
Euler-Mascheroni constant⁴ defined by

$$\gamma = \lim_{n\to\infty} \left(\sum_{m=1}^{n} m^{-1} - \ln n\right). \tag{5.27}$$

Returning to partial sums, Eq. (5.17) yields

$$s_n = \sum_{m=1}^{n} m^{-1} - \ln n < \int_1^n \frac{dx}{x} - \ln n + 1. \tag{5.28}$$

Evaluating the integral on the right, $s_n < 1$ for all $n$ and therefore $\gamma < 1$.
Exercise 5.2.8 leads to more restrictive bounds. Actually, the Euler-Mascheroni
constant is $0.577\,215\,66\cdots$. ■


⁴ This is the notation of the National Bureau of Standards (1972). Handbook of Mathematical
Functions, Applied Mathematics Series 55 (AMS-55). Dover, New York.


SUMMARY


The integral test is the workhorse for convergence. The ratio and root tests
are based on comparison with the geometric series and simpler to apply, but
they are far less sensitive and often indeterminate.


EXERCISES 
5.2.1 (a) Show that if

$$\lim_{n\to\infty} n^p u_n \to A < \infty, \quad p > 1,$$

the series $\sum_{n=1}^{\infty} u_n$ converges.

(b) Show that if

$$\lim_{n\to\infty} n u_n = A > 0,$$

the series diverges. (The test fails for $A = 0$.)

These two tests, known as limit tests, are often convenient for
establishing the convergence of a series. They may be treated as comparison
tests, comparing with

$$\sum_n n^{-q}, \quad 1 \le q < p.$$

5.2.2 If

$$\lim_{n\to\infty} \frac{b_n}{a_n} = K,$$

a constant with $0 < K < \infty$, show that $\sum_n b_n$ converges or diverges
with $\sum_n a_n$.


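5.2.3 Test for convergence

(a) $\sum_{n=2}^{\infty} (\ln n)^{-1}$
(b) $\sum_{n=1}^{\infty} \dfrac{n!}{10^n}$
(c) $\sum_{n=1}^{\infty} \dfrac{1}{2n(2n+1)}$
(d) $\sum_{n=1}^{\infty} [n(n+1)]^{-1/2}$
(e) $\sum_{n=0}^{\infty} \dfrac{1}{2n+1}$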
5.2.4 Test for convergence

(a) $\sum_{n=1}^{\infty} \dfrac{1}{n(n+1)}$
(b) $\sum_{n=2}^{\infty} \dfrac{1}{n \ln n}$
(c) $\sum_{n=1}^{\infty} \dfrac{1}{n\,2^n}$
(d) $\sum_{n=1}^{\infty} \ln\left(1 + \dfrac{1}{n}\right)$
(e) $\sum_{n=1}^{\infty} \dfrac{1}{n \cdot n^{1/n}}$

5.2.5 For what values of $p$ and $q$ will the following series converge?

$$\sum_{n=2}^{\infty} \frac{1}{n^p (\ln n)^q}.$$

ANS. Convergent for $p > 1$, all $q$, and for $p = 1$, $q > 1$;
divergent for $p < 1$, all $q$, and for $p = 1$, $q \le 1$.

5.2.6 Determine the largest and smallest values of $x$, that is, the range of
convergence for Gauss’s hypergeometric series

$$F(\alpha, \beta, \gamma; x) = 1 + \frac{\alpha\beta}{1!\,\gamma}\,x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{2!\,\gamma(\gamma+1)}\,x^2 + \cdots.$$

ANS. Convergent for $-1 < x < 1$ and $x = \pm 1$ if $\gamma > \alpha + \beta$.


5.2.7 Set upper and lower bounds on $\sum_{n=1}^{1,000,000} n^{-1}$, assuming that

(a) the Euler-Mascheroni constant [see Eq. (5.27)] is known.

ANS. $14.392\,726 < \sum_{n=1}^{1,000,000} n^{-1} < 14.392\,727$.

(b) The Euler-Mascheroni constant is unknown.

5.2.8 Given $\sum_{n=1}^{1,000} n^{-1} = 7.485\,470\ldots$, set upper and lower bounds on the
Euler-Mascheroni constant [see Eq. (5.27)].


ANS. $0.5767 < \gamma < 0.5778$.


5.2.9 (From Olbers’s paradox) Assume a static universe in which the stars
are uniformly distributed. Divide all space into shells of constant
thickness; the stars in any one shell by themselves subtend a solid angle
of $\omega_0$. Allowing for the blocking out of distant stars by nearer
stars, show that the total net solid angle subtended by all stars, shells
extending to infinity, is exactly $4\pi$. [Therefore, the night sky should
be ablaze with light. For historical and other details, see Harrison, E. 
(1987). Darkness at Night: A Riddle of the Universe. Harvard Univ. 
Press, Cambridge, MA.] 


5.2.10 Test for convergence

$$\sum_{n=1}^{\infty} \left[\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)}\right]^2 = \frac{1}{4} + \frac{9}{64} + \frac{25}{256} + \cdots.$$

Hint. Use Stirling’s asymptotic formula for the factorials.
5.2.11 The Legendre series, $\sum_{j\ \text{even}} u_j(x)$, satisfies the recurrence relations

$$u_{j+2}(x) = \frac{(j+1)(j+2) - l(l+1)}{(j+2)(j+3)}\,x^2\,u_j(x),$$

in which the index $j$ is even and $l$ is some constant (but, in this problem,
not a nonnegative odd integer). Find the range of values of $x$ for which
this Legendre series is convergent. Test the end points carefully.


ANS. $-1 < x < 1$.


5.2.12 Show that the following series is convergent:

$$\sum_{s=0}^{\infty} \frac{(2s-1)!!}{(2s)!!\,(2s+1)}.$$

Note. $(2s-1)!! = (2s-1)(2s-3) \cdots 3 \cdot 1$ with $(-1)!! = 1$; $(2s)!! =
(2s)(2s-2) \cdots 4 \cdot 2$ with $0!! = 1$. The series appears as a series expansion
of $\sin^{-1}(1)$ and equals $\pi/2$, and $\sin^{-1} x \equiv \arcsin x \ne (\sin x)^{-1}$. Use
Stirling’s asymptotic formula for the factorials.

5.2.13 Catalan’s constant [$\beta(2)$ of the National Bureau of Standards (1972).
Handbook of Mathematical Functions, AMS-55, Chapter 23. Dover,
New York] is defined by

$$\beta(2) = \sum_{k=0}^{\infty} (-1)^k (2k+1)^{-2} = \frac{1}{1^2} - \frac{1}{3^2} + \frac{1}{5^2} \cdots.$$

Calculate $\beta(2)$ to six-digit accuracy.

Hint. The rate of convergence is enhanced by pairing the terms:

$$(4k-1)^{-2} - (4k+1)^{-2} = 16k/(16k^2 - 1)^2.$$

If you have carried enough digits in your series summation, $\sum_{1 \le k \le N}
16k/(16k^2 - 1)^2$, additional significant figures may be obtained by setting
upper and lower bounds on the tail of the series, $\sum_{k=N+1}^{\infty}$. These bounds
may be set by comparison with integrals as in the Maclaurin integral
test.


ANS. $\beta(2) = 0.9159\,6559\,4177\cdots$.


5.3 Alternating Series 


In Section 5.2, we limited ourselves to series of positive terms. Now, we
consider infinite series in which the signs alternate. The partial cancellation due to
changing signs makes convergence more rapid. We prove the Leibniz criterion, 
a fairly general condition for the convergence of an alternating series—that 
is, with regular sign changes. For series with irregular sign changes, such as 
Fourier series of Chapter 14 (see Example 5.3.3), the integral test of Eqs. (5.21) 
or (5.24) is often helpful. 
Leibniz Criterion


Alternating series such as $\sum_n (-1)^n a_n$ with positive $a_n > 0$ involve summands
that change sign at every term. Let us start with a typical example.


EXAMPLE 5.3.1 


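Alternating Harmonic Series We know from Example 5.1.3 and the $na_n \to
0$ criterion that the harmonic series $\sum_n n^{-1}$ diverges. However, the
corresponding alternating series $\sum_n (-1)^n/n$ converges because we can cancel even against
each subsequent odd term (without regrouping of the series) as follows:

$$\sum_{n=2}^{\infty} \frac{(-1)^n}{n} = \sum_{\nu=1}^{\infty} \left(\frac{1}{2\nu} - \frac{1}{2\nu+1}\right) = \sum_{\nu=1}^{\infty} \frac{1}{2\nu(2\nu+1)}.$$

We see from the integral test that the last positive sum converges

$$\sum_{\nu \ge \nu_0} \frac{1}{2\nu(2\nu+1)} \approx \int_{\nu_0}^{\infty} \frac{d\nu}{2\nu(2\nu+1)} = \int_{\nu_0}^{\infty} \left(\frac{1}{2\nu} - \frac{1}{2\nu+1}\right) d\nu
= \left.\frac{1}{2}\ln\frac{2\nu}{2\nu+1}\right|_{\nu_0}^{\infty} = -\frac{1}{2}\ln\frac{2\nu_0}{2\nu_0+1} \to 0, \quad \nu_0 \to \infty.$$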


More generally, consider the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ with $a_n > 0$. The Leibniz
criterion states that if $a_n$ is monotonically decreasing (for sufficiently large
$n$) and $\lim_{n\to\infty} a_n = 0$, then the series converges.

To prove this theorem, we examine the even partial sums

$$s_{2n} = a_1 - a_2 + a_3 - \cdots - a_{2n},$$
$$s_{2n+2} = s_{2n} + (a_{2n+1} - a_{2n+2}). \tag{5.29}$$

Since $a_{2n+1} > a_{2n+2}$, we have

$$s_{2n+2} > s_{2n}. \tag{5.30}$$

On the other hand,

$$s_{2n+2} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - a_{2n+2}. \tag{5.31}$$

Hence, with each pair of terms $a_{2p} - a_{2p+1} > 0$,

$$s_{2n+2} < a_1. \tag{5.32}$$

With the even partial sums bounded $s_{2n} < s_{2n+2} < a_1$ and the terms $a_n$
decreasing monotonically and approaching zero, this alternating series converges.

One further important result can be extracted from the partial sums. From
the difference between the series limit $S$ and the partial sum $s_n$

$$S - s_n = a_{n+1} - a_{n+2} + a_{n+3} - a_{n+4} + \cdots
= a_{n+1} - (a_{n+2} - a_{n+3}) - (a_{n+4} - a_{n+5}) - \cdots \tag{5.33}$$

or

$$S - s_n < a_{n+1}. \tag{5.34}$$
Equation (5.34) states that the error in cutting off an alternating series
after $n$ terms is less than $a_{n+1}$, the first term dropped. A knowledge of
the error obtained in this way may be of great practical importance.
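The error estimate can be tested on the alternating harmonic series, whose limit is known to be $\ln 2$ (a standard result that we assume here; it is not derived in this section). The helper name below is our own.

```python
import math


def alternating_harmonic(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


S = math.log(2)  # assumed known limit of the alternating harmonic series
for n in (1, 2, 10, 101, 1000):
    error = abs(S - alternating_harmonic(n))
    print(n, error, 1 / (n + 1), error < 1 / (n + 1))
```

In each row the magnitude of the truncation error stays below the first neglected term $a_{n+1} = 1/(n+1)$, as Eq. (5.34) predicts.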


Absolute and Conditional Convergence


Given a series of terms $u_n$ in which $u_n$ may vary in sign, if $\sum |u_n|$ converges,
then $\sum u_n$ is said to be absolutely convergent. If $\sum u_n$ converges but $\sum |u_n|$
diverges, the convergence is called conditional. In contrast to absolutely
converging series whose terms can be summed in any order, conditionally
convergent series must be manipulated more carefully. Their terms must be
summed in the prescribed order.

The alternating harmonic series is a simple example of this conditional
convergence, whereas the harmonic series has been shown to be divergent in
Sections 5.1 and 5.2.


EXAMPLE 5.3.2 


Rearrangement of Harmonic Series Let us continue with the alternating
harmonic series to illustrate that conditionally converging alternating series
cannot be rearranged arbitrarily without changing their limit or disrupting their
convergence and hence must be handled carefully. If we write

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = 1 - \left(\frac{1}{2} - \frac{1}{3}\right) - \left(\frac{1}{4} - \frac{1}{5}\right) - \cdots,$$

it is clear that the sum

$$\sum_{n=1}^{\infty} (-1)^{n-1} n^{-1} < 1. \tag{5.35}$$

However, if we rearrange the terms, we may make the alternating harmonic
series converge to $\frac{3}{2}$. We regroup the terms of Eq. (5.35), taking

$$\left(1 + \frac{1}{3} + \frac{1}{5}\right) - \frac{1}{2} + \left(\frac{1}{7} + \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15}\right) - \frac{1}{4} + \left(\frac{1}{17} + \cdots + \frac{1}{25}\right) - \frac{1}{6} + \left(\frac{1}{27} + \cdots + \frac{1}{35}\right) - \frac{1}{8} + \cdots.$$

Of course, this rearrangement of terms does not omit any term but
systematically postpones negative terms, thereby changing the partial sums. Treating the terms
grouped in parentheses as single terms for convenience, we obtain the partial
sums

s_1 = 1.5333    s_2 = 1.0333
s_3 = 1.5218    s_4 = 1.2718
s_5 = 1.5143    s_6 = 1.3476
s_7 = 1.5103    s_8 = 1.3853
s_9 = 1.5078    s_10 = 1.4078.

From this tabulation of $s_n$ and the plot of $s_n$ versus $n$ in Fig. 5.2 the convergence
to $\frac{3}{2}$ is fairly clear. We have rearranged the terms, taking positive terms until the
partial sum was equal to or greater than $\frac{3}{2}$, then adding in negative terms until
the partial sum just fell below $\frac{3}{2}$, and so on. As the series extends to infinity, all
original terms will eventually appear, but the partial sums of this rearranged
alternating harmonic series converge to $\frac{3}{2}$.

Figure 5.2 Alternating Harmonic Series—Terms Rearranged to Give Convergence to 1.5

By a suitable rearrangement of terms, a conditionally convergent series may
be made to converge to any desired value or even to diverge. This statement is
sometimes called Riemann’s theorem. Obviously, conditionally convergent
series must be treated with caution. ■
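The rearrangement rule described in this example (positive terms until the running sum reaches the target, then negative terms until it falls below) can be written as a short greedy procedure. The function name and its parameters are hypothetical; the sketch only reproduces the bookkeeping of the example.

```python
def rearranged_partial_sums(target, groups):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`.

    Positive terms 1, 1/3, 1/5, ... are added until the running sum is
    >= target, then negative terms -1/2, -1/4, ... until it is < target.
    Returns the running sum after each of the first `groups` groups.
    """
    odd, even = 1, 2  # next unused odd and even denominators
    total, sums = 0.0, []
    for g in range(groups):
        if g % 2 == 0:
            while total < target:
                total += 1.0 / odd
                odd += 2
        else:
            while total >= target:
                total -= 1.0 / even
                even += 2
        sums.append(total)
    return sums


for i, s in enumerate(rearranged_partial_sums(1.5, 10), start=1):
    print(f"s_{i} = {s:.4f}")
```

The printed values match the tabulated partial sums to within one unit in the last digit (the table appears to truncate rather than round). Changing `target` illustrates Riemann's theorem: the same terms can be steered toward any other value.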


Note that most tests developed in Section 5.2 assume a series of positive 
terms and therefore guarantee absolute convergence. 


EXAMPLE 5.3.3 


Convergence of a Fourier Series For $0 < x < 2\pi$, the Fourier series (see
Section 14.1 for an introduction)

$$\sum_{n=1}^{\infty} \frac{\cos(nx)}{n} = -\ln\left(2\sin\frac{x}{2}\right) \tag{5.36}$$


converges, having coefficients that change sign often but not at every term
(except for $x = \pi$) so that Leibniz’s convergence criterion cannot be used.
Let us apply the integral test of Eq. (5.21). Using integration by parts we see 
immediately that 

$$\int_1^{\infty} \frac{\cos(nx)}{n}\,dn = \left[\frac{\sin(nx)}{nx}\right]_{n=1}^{\infty} + \frac{1}{x}\int_1^{\infty} \frac{\sin(nx)}{n^2}\,dn \tag{5.37}$$


converges, and the integral on the right-hand side even converges absolutely. 
The derivative term in Eq. (5.21) has the form 



$$\int_1^{\infty} (n - [n]) \left[ -\frac{x}{n}\sin(nx) - \frac{\cos(nx)}{n^2} \right] dn, \tag{5.38}$$


where the second term converges absolutely and need not be considered further.
Next, we observe that $g(N) = \int_1^N (n - [n]) \sin(nx)\,dn$ is bounded for
$N \to \infty$, just as $\int^N \sin(nx)\,dn$ is bounded because of the periodic nature of
$\sin(nx)$ and its regular sign changes. Using integration by parts again,


$$\int_1^{\infty} \frac{g'(n)}{n}\,dn = \left[\frac{g(n)}{n}\right]_{n=1}^{\infty} + \int_1^{\infty} \frac{g(n)}{n^2}\,dn, \tag{5.39}$$


we see that the second term is absolutely convergent, and the first goes to zero 
at the upper limit. Hence, the series in Eq. (5.36) converges, which is difficult 
to see from other convergence tests. ■ 
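
As a numerical cross-check of Eq. (5.36), the partial sums can be compared with the closed form. This is only a sketch under stated assumptions: the function names and the cutoff of 100,000 terms are illustrative, and agreement is expected only to a few decimal places because the series converges conditionally and therefore slowly.

```python
import math


def cos_series_partial_sum(x, n_terms):
    """Partial sum of sum_{n>=1} cos(n x)/n with n_terms terms."""
    return sum(math.cos(n * x) / n for n in range(1, n_terms + 1))


def closed_form(x):
    """Right-hand side of Eq. (5.36), valid for 0 < x < 2*pi."""
    return -math.log(2.0 * math.sin(x / 2.0))


for x in (0.5, 1.0, 2.0, 3.0):
    print(x, cos_series_partial_sum(x, 100000), closed_form(x))
```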


There are many conditionally convergent series with sign changes so irreg¬ 
ular that it is sometimes extremely difficult to determine whether or not they 
converge. A prominent example is the inverse of the Riemann zeta function
(see Example 5.2.1) $\zeta(s)$ for $s \to 1$, which can be defined by a product over all
prime numbers p as

$$\frac{1}{\zeta(s)} = \prod_p \left(1 - \frac{1}{p^s}\right),$$

where the exponent $s = 1 + \varepsilon$ for $\varepsilon \to 0$. Multiplying individual terms of the
product yields the series $\sum_{n=1}^{\infty} \mu(n)/n^s$ with $\mu(1) = 1$, $\mu(n) = (-1)^m$ if the integer
$n = p_1 p_2 \cdots p_m$ is the product of m different primes $p_i$ and $\mu(n) = 0$ else; that
is,

$$\frac{1}{\zeta(s)} = 1 - \frac{1}{2^s} - \frac{1}{3^s} - \frac{1}{5^s} + \frac{1}{6^s} + \cdots.$$

Proving that $1 - \frac12 - \frac13 - \frac15 + \frac16 + \cdots$ converges to 0 is known to be equivalent
to showing that there are asymptotically $x/\ln x$ prime numbers $p \le x$. This asymptotic relation
is known as the prime number theorem, which was conjectured first by Gauss
and proved independently by J. Hadamard and Ch. de la Vallée-Poussin
in 1896 using the analytic properties of the zeta function as a function
of the complex variable s. 
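
The irregular signs of this series can be seen numerically. The sketch below is an illustration only: the helper names `mobius_up_to` and `mobius_partial_sum` are hypothetical, and the sieve is one simple way of tabulating $\mu(n)$. A finite partial sum being small is of course no proof of convergence.

```python
def mobius_up_to(n_max):
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= n_max."""
    mu = [1] * (n_max + 1)
    is_prime = [True] * (n_max + 1)
    for p in range(2, n_max + 1):
        if is_prime[p]:
            # every multiple of the prime p picks up one factor of -1
            for k in range(p, n_max + 1, p):
                if k > p:
                    is_prime[k] = False
                mu[k] = -mu[k]
            # multiples of p*p contain a repeated prime, so mu = 0
            for k in range(p * p, n_max + 1, p * p):
                mu[k] = 0
    return mu


def mobius_partial_sum(n_max):
    """Partial sum of sum_n mu(n)/n up to n = n_max."""
    mu = mobius_up_to(n_max)
    return sum(mu[n] / n for n in range(1, n_max + 1))


print(mobius_up_to(10)[1:])        # signs of the first ten terms
for n_max in (10, 100, 1000, 10000):
    print(n_max, mobius_partial_sum(n_max))
```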


SUMMARY


Sign changes are essential for the convergence of conditionally convergent
series. Alternating series have regular sign changes with each term so that 
convergence can be easily tested by Leibniz’s criterion. The integral test is 
sometimes applicable. Deciding the convergence of series with irregular sign 
changes can be extremely difficult. No general tests exist.

EXERCISES


5.3.1 (a) From the electrostatic two-hemisphere problem (Exercises 14.3.6,
14.3.7) we obtain the series

$$\sum_{s=0}^{\infty} (-1)^s (4s+3) \frac{(2s-1)!!}{(2s+2)!!}.$$

Test it for convergence.

(b) The corresponding series for the surface charge density is

$$\sum_{s=0}^{\infty} (-1)^s (4s+3) \frac{(2s-1)!!}{(2s)!!}.$$

Test it for convergence. The $n!!$ notation is defined in Exercise 5.2.12.
Hint. Use Stirling’s asymptotic formula for the factorials.


5.3.2 Show by direct numerical computation that the sum of the first 10 terms
of

$$\lim_{x \to 1} \ln(1+x) = \ln 2 = \sum_{n=1}^{\infty} (-1)^{n-1} n^{-1}$$

differs from ln 2 by less than the 11th term: ln 2 = 0.69314 71806 ....


5.3.3 In Exercise 5.2.6, the hypergeometric series is shown convergent for
$x = \pm 1$, if $\gamma > \alpha + \beta$. Show that there is conditional convergence for
$x = -1$ for $\gamma$ down to $\gamma > \alpha + \beta - 1$.

Hint. The asymptotic behavior of the factorial function is given by 
Stirling’s formula. 


5.3.4 If the $u_n$ decrease monotonically to zero with $n \to \infty$, show that
$\sum_n u_n (-1)^{[n/3]}$ converges, where $[n/3]$ is the largest integer below $n/3$.


5.4 Algebra of Series 


The establishment of absolute convergence is important because it can be 
proved that absolutely convergent series can be reordered according to the 
familiar rules of algebra or arithmetic: 

• If an infinite series is absolutely convergent, the series sum is independent 
of the order in which the terms are added. 

• The series may be multiplied with another absolutely convergent series. 
The limit of the product will be the product of the individual series limits. 
The product series, a double series, will also converge absolutely. 

No such guarantees can be given for conditionally convergent series, as we 
have seen in Example 5.3.2.

Multiplication of Series

The standard or Cauchy product of two series is

$$\sum_n u_n \cdot \sum_n v_n = \sum_n c_n, \qquad c_n = u_0 v_n + u_1 v_{n-1} + \cdots + u_n v_0, \tag{5.40}$$

which can be written as a double sum $\sum_n \sum_{m=0}^{n} u_m v_{n-m}$, where the sum of the
indices is $m + (n - m) = n$ for the nth term $c_n$ of the product series.

Absolutely convergent series can be multiplied without problems. This fol¬ 
lows as a special case from the rearrangement of double series. However, con¬ 
ditionally convergent series cannot always be multiplied to yield convergent 
series, as the following example shows. 


EXAMPLE 5.4.1 


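Square of a Conditionally Convergent Series May Diverge The series
$\sum_{n=1}^{\infty} (-1)^{n-1}/\sqrt{n}$ converges by the Leibniz criterion. Its square

$$\left[\sum_n \frac{(-1)^{n-1}}{\sqrt{n}}\right]^2 = \sum_n (-1)^n \left[\frac{1}{\sqrt{1}}\frac{1}{\sqrt{n-1}} + \frac{1}{\sqrt{2}}\frac{1}{\sqrt{n-2}} + \cdots + \frac{1}{\sqrt{n-1}}\frac{1}{\sqrt{1}}\right]$$

has the general term in brackets consisting of $n - 1$ additive terms, each of
which is $\ge \frac{1}{\sqrt{n-1}\sqrt{n-1}}$, so that the product term in brackets is $\ge \frac{n-1}{n-1} = 1$ and does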

not go to zero. Hence, this product oscillates and therefore diverges. ■ 
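
The bracketed term of Example 5.4.1 is easy to evaluate directly. The following sketch (the function name `cauchy_square_term` is an illustrative assumption) computes the nth term of the Cauchy square and shows that its magnitude never drops below 1, so the necessary condition for convergence fails.

```python
import math


def cauchy_square_term(n):
    """n-th term (n >= 2) of the Cauchy square of sum_{k>=1} (-1)^(k-1)/sqrt(k)."""
    # bracket = sum over k of 1/(sqrt(k) * sqrt(n - k)), k = 1 .. n-1
    bracket = sum(1.0 / math.sqrt(k * (n - k)) for k in range(1, n))
    return (-1) ** n * bracket


for n in (2, 3, 10, 100, 1000):
    print(n, cauchy_square_term(n))
```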


Hence, for a product of two series to converge we have to demand as a
sufficient condition that at least one of them converges absolutely. To prove
this product convergence theorem that if $\sum_n u_n$ converges absolutely to U,
$\sum_n v_n$ converges to V, then

$$\sum_n c_n = \sum_n \sum_{m=0}^{n} u_m v_{n-m}$$

converges to UV, it is sufficient to show that the difference terms
$D_n \equiv c_0 + c_1 + \cdots + c_{2n} - U_n V_n \to 0$ for $n \to \infty$, where $U_n$, $V_n$ are the partial sums of our
series. As a result, the partial sum differences

$$D_n = u_0 v_0 + (u_0 v_1 + u_1 v_0) + \cdots + (u_0 v_{2n} + u_1 v_{2n-1} + \cdots + u_{2n} v_0) - (u_0 + u_1 + \cdots + u_n)(v_0 + v_1 + \cdots + v_n)$$

$$= u_0 (v_{n+1} + \cdots + v_{2n}) + u_1 (v_{n+1} + \cdots + v_{2n-1}) + \cdots + u_{n-1} v_{n+1} + u_{n+1} (v_0 + \cdots + v_{n-1}) + \cdots + u_{2n} v_0$$

so that for all sufficiently large n

$$|D_n| < \varepsilon \left(|u_0| + \cdots + |u_{n-1}|\right) + M \left(|u_{n+1}| + \cdots + |u_{2n}|\right) < \varepsilon (a + M)$$

because $|v_{n+1} + v_{n+2} + \cdots + v_{n+m}| < \varepsilon$ for sufficiently large n and all positive
integers m as $\sum_n v_n$ converges, and the partial sums $|V_n| < M$ of $\sum_n v_n$ are
bounded by M because the sum converges. Finally, we call $\sum_n |u_n| = a$, as
$\sum_n u_n$ converges absolutely.

SUMMARY


Two series can be multiplied provided one of them converges absolutely. Addition
and subtraction of series are also valid termwise if one series converges
absolutely.


EXERCISE 


5.4.1 Given the series (derived in Section 5.6)

$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \qquad -1 < x \le 1,$$

show that

$$\ln\left(\frac{1+x}{1-x}\right) = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right), \qquad -1 < x < 1.$$

The original series, ln(1 + x), appears in an analysis of binding energy
in crystals. It is 1/2 the Madelung constant (2 ln 2) for a chain of atoms.
The second series is useful in normalizing the Legendre polynomials 
(Section 11.3). 


5.5 Series of Functions



We extend our concept of infinite series to include the possibility that each
term $u_n$ may be a function of some variable, $u_n = u_n(x)$. Numerous illustrations
of such series of functions appear in Chapters 11-13. The partial sums become
functions of the variable x,

$$s_n(x) = u_1(x) + u_2(x) + \cdots + u_n(x), \tag{5.41}$$

as does the series sum, defined as the limit of the partial sums

$$\sum_{n=1}^{\infty} u_n(x) = S(x) = \lim_{n \to \infty} s_n(x). \tag{5.42}$$

So far we have concerned ourselves with the behavior of the partial sums as 
a function of n. Now we consider how the foregoing quantities depend on x. 
The key concept here is that of uniform convergence. 


The concept of uniform convergence is related to a rate of convergence of a
series of functions in an interval [a, b] of the variable x that is independent
of the location of x in that interval. Given uniform convergence, a series of
continuous functions converges to a continuous function, limiting values of
the sum exist at any point x of the interval, and even termwise differentiation
of a series is valid under suitable conditions. Note that the interval must be
closed, that is, include both end points.


Uniform Convergence

[Figure 5.3: Uniform convergence.]

If for any small $\varepsilon > 0$ there exists a number N, independent of x in the
interval [a, b] (i.e., $a \le x \le b$), such that

$$|S(x) - s_n(x)| < \varepsilon, \qquad \text{for all } n \ge N, \tag{5.43}$$

the series is defined to be uniformly convergent in the interval [a, b]. This
states that for our series to be uniformly convergent, it must be possible to
find a finite N so that the tail of the infinite series, $\left|\sum_{i=N+1}^{\infty} u_i(x)\right|$, will be less
than an arbitrarily small $\varepsilon$ for all x in the given interval.

This condition, Eq. (5.43), which defines uniform convergence, is illustrated
in Fig. 5.3. The point is that no matter how small $\varepsilon$ is taken to be, we can
always choose n large enough so that the absolute magnitude of the difference
between $S(x)$ and $s_n(x)$ is less than $\varepsilon$ for all x, $a \le x \le b$. If this cannot be
done, then $\sum u_n(x)$ is not uniformly convergent in [a, b].


EXAMPLE 5.5.1 


Nonuniform Convergence

$$\sum_{n=1}^{\infty} u_n(x) = \sum_{n=1}^{\infty} \frac{x}{[(n-1)x+1][nx+1]}. \tag{5.44}$$

A formula for the partial sum, $s_n(x) = nx(nx+1)^{-1}$, may be verified by
mathematical induction. By inspection, this expression for $s_n(x)$ holds for
n = 1, 2. We assume it holds for n terms and then prove it holds for n + 1
terms:

$$s_{n+1}(x) = s_n(x) + \frac{x}{[nx+1][(n+1)x+1]} = \frac{nx}{[nx+1]} + \frac{x}{[nx+1][(n+1)x+1]} = \frac{(n+1)x}{(n+1)x+1}, \tag{5.45}$$

completing the proof.

Letting n approach infinity, we obtain

$$S(0) = \lim_{n\to\infty} s_n(0) = 0, \qquad S(x \ne 0) = \lim_{n\to\infty} s_n(x \ne 0) = 1.$$

We have a discontinuity in our series limit at x = 0. However, $s_n(x)$ is a continuous
function of x, for $0 \le x \le 1$, for all finite n. Equation (5.43) with $\varepsilon$
sufficiently small will be violated for all finite n. Our series does not converge
uniformly. ■ 
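
A short numerical sketch makes the failure of Eq. (5.43) concrete. The function names are illustrative assumptions; the closed form $s_n(x) = nx/(nx+1)$ is the one derived in the example. However large n is taken, the point x = 1/n still has an error of 1/2, so no single N works for all x in [0, 1].

```python
def partial_sum(n, x):
    """Closed form s_n(x) = n x / (n x + 1) of the series in Eq. (5.44)."""
    return n * x / (n * x + 1.0)


def limit_sum(x):
    """Limit S(x): 0 at x = 0 and 1 for x > 0."""
    return 0.0 if x == 0 else 1.0


def error(n, x):
    """|S(x) - s_n(x)| at a given n and x."""
    return abs(limit_sum(x) - partial_sum(n, x))


for n in (10, 1000, 10**6):
    # the error at x = 1/n stays at 1/2 no matter how large n is
    print(n, error(n, 1.0 / n), error(n, 0.5))
```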


Weierstrass M (Majorant) Test 


The most commonly encountered test for uniform convergence is the Weierstrass
M test. If we can construct a series of numbers $\sum_{i=1}^{\infty} M_i$, in which
$M_i \ge |u_i(x)|$ for all x in the interval [a, b] and $\sum_{i=1}^{\infty} M_i$ is convergent, our
series $\sum_{i=1}^{\infty} u_i(x)$ will be uniformly convergent in [a, b].

The proof of this Weierstrass M test is direct and simple. Since $\sum_i M_i$
converges, some number N exists such that for $n + 1 \ge N$,

$$\sum_{i=n+1}^{\infty} M_i < \varepsilon. \tag{5.46}$$

This follows from our definition of convergence. Then, with $|u_i(x)| \le M_i$ for
all x in the interval $a \le x \le b$,

$$\sum_{i=n+1}^{\infty} |u_i(x)| < \varepsilon. \tag{5.47}$$

Hence,

$$|S(x) - s_n(x)| = \left|\sum_{i=n+1}^{\infty} u_i(x)\right| < \varepsilon, \tag{5.48}$$

and by definition $\sum_{i=1}^{\infty} u_i(x)$ is uniformly convergent in [a, b]. Since we have
specified absolute values in the statement of the Weierstrass M test, the series
$\sum_{i=1}^{\infty} u_i(x)$ is also seen to be absolutely convergent. Note that uniform convergence
and absolute convergence are independent properties. Neither implies
the other. For specific examples,


$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n + x^2}, \qquad -\infty < x < \infty \tag{5.49}$$

and

$$\sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n} = \ln(1+x), \qquad 0 \le x \le 1 \tag{5.50}$$

converge uniformly in the indicated intervals but do not converge absolutely.
On the other hand,

$$\sum_{n=0}^{\infty} (1-x) x^n = \begin{cases} 1, & 0 \le x < 1 \\ 0, & x = 1 \end{cases} \tag{5.51}$$

converges absolutely but does not converge uniformly in [0, 1].

From the definition of uniform convergence we may show that any series

$$f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.52}$$

of continuous terms $u_n(x)$ cannot converge uniformly in any interval that includes
a discontinuity of f(x).

Since the Weierstrass M test establishes both uniform and absolute convergence,
it will necessarily fail for series that are uniformly but conditionally
convergent.


Abel’s Test

A more delicate test for uniform convergence has been given by Abel. If

$$u_n(x) = a_n f_n(x), \qquad \sum_n a_n = A, \ \text{convergent},$$

and the functions $f_n(x)$ are monotonic $[f_{n+1}(x) \le f_n(x)]$ and bounded,
$0 \le f_n(x) \le M$, for all x in [a, b], then $\sum_n u_n(x)$ converges uniformly in [a, b].
This test is especially useful in analyzing power series (compare 
Section 5.7). Details of the proof of Abel’s test and other tests for uniform 
convergence are given in the Additional Reading section at the end of this 
chapter. 
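
The series of Eq. (5.50) fits Abel's test with $a_n = (-1)^{n-1}/n$ and $f_n(x) = x^n$ on [0, 1]. The sketch below is an illustration, not a proof; the function names, the grid of 51 points, and N = 100 are assumptions. It checks that the error of the N-term partial sum stays below 1/(N + 1) at every grid point, that is, below a bound that does not depend on x.

```python
import math


def log_series_partial_sum(x, n_terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n with n_terms terms."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, n_terms + 1))


def max_error_on_grid(n_terms, n_points=51):
    """Largest |ln(1+x) - s_N(x)| over an equally spaced grid on [0, 1]."""
    worst = 0.0
    for i in range(n_points):
        x = i / (n_points - 1)
        err = abs(math.log1p(x) - log_series_partial_sum(x, n_terms))
        worst = max(worst, err)
    return worst


n_terms = 100
print(max_error_on_grid(n_terms), 1.0 / (n_terms + 1))
```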

Uniformly convergent series have three particularly useful properties: 


• If the individual terms $u_n(x)$ are continuous, the series sum

$$f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.53}$$

is also continuous.

• If the individual terms $u_n(x)$ are continuous, the series may be integrated
term by term. The sum of the integrals is equal to the integral of the sum:

$$\int_a^b f(x)\,dx = \sum_{n=1}^{\infty} \int_a^b u_n(x)\,dx. \tag{5.54}$$

• The derivative of the series sum f(x) equals the sum of the individual term
derivatives,

$$\frac{d}{dx} f(x) = \sum_{n=1}^{\infty} \frac{d}{dx} u_n(x), \tag{5.55}$$

provided the following conditions are satisfied:

$u_n(x)$ and $\dfrac{du_n(x)}{dx}$ are continuous in [a, b].

$\sum_{n=1}^{\infty} \dfrac{du_n(x)}{dx}$ is uniformly convergent in [a, b].


SUMMARY

Term-by-term integration of a uniformly convergent series⁵ requires only continuity
of the individual terms. This condition is almost always satisfied in physical
applications. Term-by-term differentiation of a series is often not valid
because more restrictive conditions must be satisfied. Indeed, we shall encounter
Fourier series in Chapter 14 in which term-by-term differentiation of
a uniformly convergent series leads to a divergent series. Extracting values of
series of functions for limiting values of the argument also requires uniform
convergence.


Biographical Data 

Abel, Niels Hendrik. Abel, a Norwegian mathematician, was born in 1802 
on Finnoy Island and died in 1829 in Froland. Son of an alcoholic pastor, 
he lived in poverty and suffered professionally since he lived in Norway, 
far from the scientific mainstream in France and Germany. A teacher at the 
University of Christiania (now Oslo), where Abel studied, recognized his 
talents and supported him financially. In 1824, he showed the impossibility 
of solving algebraically for the roots of polynomials of 5th order. He sent 
his proof to Gauss, who, surprisingly, did not recognize his advance. He 
found the definitive form of the binomial expansion theorem and solved an 
integral equation now named after him. Recognition came so late (1829) that 
an appointment to a professorial position in Berlin arrived 2 days after his 
death at the age of 26. 


EXERCISES 

5.5.1 Find the range of uniform convergence of the Dirichlet series

$$\text{(a)} \ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^x}, \qquad \text{(b)} \ \zeta(x) = \sum_{n=1}^{\infty} \frac{1}{n^x}.$$

ANS. (a) $1 < x < \infty$. (b) $1 < s \le x < \infty$, with s arbitrary but fixed.

5.5.2 For what range of x is the geometric series $\sum_{n=0}^{\infty} x^n$ uniformly convergent?


ANS. $-1 < -s \le x \le s < 1$.


5.5.3 For what range of positive values of x is $\sum_{n=0}^{\infty} 1/(1 + x^n)$
(a) convergent? (b) uniformly convergent?


5 Term-by-term integration may also be valid in the absence of uniform convergence. 













5.5.4 If the series of the coefficients $\sum a_n$ and $\sum b_n$ are absolutely convergent,
show that the Fourier series

$$\sum (a_n \cos nx + b_n \sin nx)$$

is uniformly convergent for $-\infty < x < \infty$.


5.6 Taylor’s Expansion 


This is an expansion of a function into an infinite series or into a finite series 
plus a remainder term. The coefficients of the successive terms of the series in¬ 
volve the successive derivatives of the function. We have already used Taylor’s 
expansion in the establishment of a physical interpretation of divergence 
(Section 1.6) and in other sections of Chapters 1 and 2. Now we derive the 
Taylor expansion. 

We assume that our function f(x) has a continuous nth derivative⁶ in the
interval $a \le x \le b$. Then, integrating this nth derivative n times,

$$\int_a^x f^{(n)}(x_1)\,dx_1 = f^{(n-1)}(x_1)\Big|_a^x = f^{(n-1)}(x) - f^{(n-1)}(a),$$

$$\int_a^x dx_2 \int_a^{x_2} dx_1\, f^{(n)}(x_1) = \int_a^x dx_2 \left[f^{(n-1)}(x_2) - f^{(n-1)}(a)\right] = f^{(n-2)}(x) - f^{(n-2)}(a) - (x-a) f^{(n-1)}(a). \tag{5.56}$$

Continuing, we obtain

$$\int_a^x dx_3 \int_a^{x_3} dx_2 \int_a^{x_2} dx_1\, f^{(n)}(x_1) = f^{(n-3)}(x) - f^{(n-3)}(a) - (x-a) f^{(n-2)}(a) - \frac{(x-a)^2}{2!} f^{(n-1)}(a). \tag{5.57}$$

Finally, on integrating for the nth time,

$$\int_a^x dx_n \cdots \int_a^{x_2} dx_1\, f^{(n)}(x_1) = f(x) - f(a) - (x-a) f'(a) - \frac{(x-a)^2}{2!} f''(a) - \cdots - \frac{(x-a)^{n-1}}{(n-1)!} f^{(n-1)}(a). \tag{5.58}$$

Note that this expression is exact. No terms have been dropped and no approximations
made. Now, solving for f(x), we have

$$f(x) = f(a) + (x-a) f'(a) + \frac{(x-a)^2}{2!} f''(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + R_n. \tag{5.59}$$


6 Taylor’s expansion may be derived under slightly less restrictive conditions; compare Jeffreys,
H., and Jeffreys, B. S. (1956). Methods of Mathematical Physics, 3rd ed., Section 1.133. Cambridge
Univ. Press, Cambridge, UK.

In a first reading of this section, the remainder, $R_n$, can be ignored and the
reader may continue with the infinite Taylor expansion series of Eq. (5.59).
The remainder, $R_n$, is given by the n-fold integral

$$R_n = \int_a^x dx_n \cdots \int_a^{x_2} dx_1\, f^{(n)}(x_1). \tag{5.60}$$

This remainder may be put into a perhaps more practical form by using the
mean value theorem of integral calculus:

$$\int_a^x g(x)\,dx = (x-a)\, g(\xi), \tag{5.61}$$

with $a \le \xi \le x$. By integrating n times we get the Lagrangian form⁷ of the
remainder:

$$R_n = \frac{(x-a)^n}{n!} f^{(n)}(\xi). \tag{5.62}$$

With Taylor’s expansion in this form, we are not concerned with any questions
of infinite series convergence. This series is finite, and the only question
concerns the magnitude of the remainder.

When the function f(x) is such that 


$$\lim_{n\to\infty} R_n = 0, \tag{5.63}$$

Eq. (5.59) becomes Taylor’s series⁸

$$f(x) = f(a) + (x-a) f'(a) + \frac{(x-a)^2}{2!} f''(a) + \cdots = \sum_{n=0}^{\infty} \frac{(x-a)^n}{n!} f^{(n)}(a). \tag{5.64}$$

Our Taylor series specifies the value of a function at one point, x, in terms of
the value of the function and its derivatives at a reference point, a. It is an
expansion in powers of the change in the variable, $\Delta x = x - a$ in this case.
The notation may be varied at the user’s convenience. With the substitution
$x \to x + h$ and $a \to x$, we have an alternate form

$$f(x+h) = \sum_{n=0}^{\infty} \frac{h^n}{n!} f^{(n)}(x).$$


Biographical Data 

Taylor, Brooke. Taylor, an English mathematician, was born in 1685 and
died in 1731. He was educated in Cambridge and developed his famous
theorem in the context of work on methods of finite differences (published
in 1715). He also discovered the method of integration by parts. The convergence
of his series was not considered until a century later and then proved
by Cauchy under suitable conditions. The theorem remained obscure until
Lagrange recognized its importance and named it after Taylor.


7 An alternate form derived by Cauchy is

$$R_n = \frac{(x-\xi)^{n-1}(x-a)}{(n-1)!} f^{(n)}(\xi),$$

with $a \le \xi \le x$.

8 Note that 0! = 1 (compare Section 10.1).


Maclaurin Theorem 

If we expand about the origin (a = 0), Eq. (5.64) is known as Maclaurin’s series

$$f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!} f^{(n)}(0). \tag{5.65}$$


An immediate application of the Maclaurin series (or the Taylor series) is 
in the expansion of various transcendental functions into infinite series. 


EXAMPLE 5.6.1 


Exponential Function Let $f(x) = e^x$. Differentiating, we have

$$f^{(n)}(0) = 1 \tag{5.66}$$

for all n, n = 1, 2, 3, .... Then, by Eq. (5.65), we have

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}. \tag{5.67}$$


This is the series expansion of the exponential function. Some authors use this 
series to define the exponential function. The ratio test applied to this series, 
$x^{n+1} n!/[x^n (n+1)!] = x/(n+1) \to 0$ as $n \to \infty$, shows that it converges for
any finite x. In a first reading, therefore, the reader may skip the analysis of 
the remainder and proceed to Example 5.6.2. 

Although this series is clearly convergent for all x, we should check the
remainder term, $R_n$. By Eq. (5.62) we have

$$R_n = \frac{x^n}{n!} f^{(n)}(\xi) = \frac{x^n}{n!} e^{\xi}, \qquad 0 \le |\xi| \le x. \tag{5.68}$$

Therefore,

$$|R_n| \le \frac{x^n e^x}{n!} \tag{5.69}$$

and

$$\lim_{n\to\infty} R_n = 0 \tag{5.70}$$

for all finite values of x, which indicates that this Maclaurin expansion of $e^x$
converges absolutely over the range $-\infty < x < \infty$.

Now we use the operator $\partial = d/dx$ in the Taylor expansion, which
becomes

$$f(x+h) = \sum_{n=0}^{\infty} \frac{h^n \partial^n}{n!} f(x) = e^{h\partial} f(x).$$

As the argument x is shifted to x + h, the operator $\partial$ effects a translation in a
differentiable function. An equivalent operator form of this Taylor expansion 
appears in Exercise 4.2.3. A derivation of the Taylor expansion in the context 
of complex variable theory appears in Section 6.5. ■ 
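
The remainder bound of Eq. (5.69) can be checked numerically for positive x. This is a minimal sketch with illustrative function names; `exp_taylor(x, n)` sums the powers 0 through n - 1, so that its error is the remainder $R_n$ of Eq. (5.59).

```python
import math


def exp_taylor(x, n):
    """Sum of the first n terms (powers 0 .. n-1) of the Maclaurin series of e^x."""
    term, total = 1.0, 0.0
    for k in range(n):
        total += term
        term *= x / (k + 1)   # x^k/k!  ->  x^(k+1)/(k+1)!
    return total


def remainder_bound(x, n):
    """Bound x^n e^x / n! of Eq. (5.69), for x >= 0."""
    return x ** n * math.exp(x) / math.factorial(n)


for x in (0.5, 1.0, 3.0):
    for n in (2, 5, 10, 20):
        actual = abs(math.exp(x) - exp_taylor(x, n))
        print(x, n, actual, remainder_bound(x, n))
```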


EXAMPLE 5.6.2 


Logarithm Let $f(x) = \ln(1+x)$. By differentiating, we obtain

$$f'(x) = (1+x)^{-1}, \qquad f^{(n)}(x) = (-1)^{n-1} (n-1)!\,(1+x)^{-n}. \tag{5.71}$$

The Maclaurin expansion [Eq. (5.65)] yields

$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots + R_n = \sum_{p=1}^{n} (-1)^{p-1} \frac{x^p}{p} + R_n. \tag{5.72}$$

Again, in a first reading, the remainder may be ignored. We find that the infinite
series

$$\ln(1+x) = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n} \tag{5.73}$$

converges for $-1 < x \le 1$. The range $-1 < x < 1$ is established by the
d’Alembert ratio test (Section 5.2). Convergence at x = 1 follows by the Leibniz
criterion (Section 5.3). In particular, at x = 1, we have

$$\ln 2 = 1 - \frac12 + \frac13 - \frac14 + \frac15 - \cdots = \sum_{n=1}^{\infty} (-1)^{n-1} n^{-1}, \tag{5.74}$$

the conditionally convergent alternating harmonic series. 
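
How slowly Eq. (5.74) converges can be seen from a few partial sums. The sketch below uses an illustrative function name; it prints the partial sum, its distance from ln 2, and the magnitude of the first omitted term, which bounds the error for an alternating series with monotonically decreasing terms.

```python
import math


def alternating_harmonic(n_terms):
    """Partial sum 1 - 1/2 + 1/3 - ... with n_terms terms."""
    return sum((-1) ** (n - 1) / n for n in range(1, n_terms + 1))


for n_terms in (10, 100, 1000):
    s = alternating_harmonic(n_terms)
    # columns: number of terms, partial sum, actual error, first omitted term
    print(n_terms, s, abs(math.log(2) - s), 1.0 / (n_terms + 1))
```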

Returning to the remainder, in this case it is given by

$$R_n = \frac{x^n}{n!} f^{(n)}(\xi), \qquad 0 \le \xi \le x$$

$$\le \frac{x^n}{n}, \qquad 0 \le \xi \le x \le 1. \tag{5.75}$$

Now the remainder approaches zero as n is increased indefinitely, provided
$0 \le x \le 1$.⁹ ■



Binomial Theorem 

A second, extremely important application of the Taylor and Maclaurin expan¬ 
sions is the derivation of the binomial theorem for negative and/or nonintegral 
powers. 


9 This range can easily be extended to $-1 < x \le 1$ but not to x = −1.

Let $f(x) = (1+x)^m$, where m may be negative and is not limited to integral
values. Direct application of Eq. (5.65) gives

$$(1+x)^m = 1 + mx + \frac{m(m-1)}{2!} x^2 + \cdots + R_n. \tag{5.76}$$


Ignoring the remainder at first, the binomial expansion leads to the infinite
series

$$(1+x)^m = 1 + mx + \frac{m(m-1)}{2!} x^2 + \frac{m(m-1)(m-2)}{3!} x^3 + \cdots. \tag{5.77}$$

In other, equivalent notation

$$(1+x)^m = \sum_{n=0}^{\infty} \frac{m!}{n!\,(m-n)!} x^n = \sum_{n=0}^{\infty} \binom{m}{n} x^n. \tag{5.78}$$

The quantity $\binom{m}{n}$, which equals $m!/[n!\,(m-n)!]$, is called a binomial coefficient.
Using the ratio and Leibniz tests, the series in Eq. (5.77) may actually be
shown to be convergent for the extended range $-1 < x < 1$. For m an integer,
$(m-n)! = \pm\infty$ if $n > m$ (Section 10.1), and the series automatically terminates
at n = m.

We now return to the remainder for this function

$$R_n = \frac{x^n}{n!} (1+\xi)^{m-n}\, m(m-1) \cdots (m-n+1), \tag{5.79}$$

where $\xi$ lies between 0 and x, $0 \le \xi \le x$. Now, for $n > m$, $(1+\xi)^{m-n}$ is a
maximum for $\xi = 0$. Therefore,

$$R_n \le \frac{x^n}{n!}\, m(m-1) \cdots (m-n+1). \tag{5.80}$$

Note that the m-dependent factors do not yield a zero unless m is a nonnegative
integer; $R_n$ tends to zero as $n \to \infty$ if x is restricted to the range $0 \le x < 1$.
Although we have only shown that the remainder vanishes,

$$\lim_{n\to\infty} R_n = 0,$$

for 0 < x < 1, the argument can be extended to — 1 < x < 0. 
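
The binomial series of Eq. (5.77) for a nonintegral or negative power can be summed with a simple recurrence for the coefficients. The following is a sketch with an illustrative function name; the coefficient of $x^{n+1}$ is obtained from that of $x^n$ by multiplying with $(m-n)/(n+1)$.

```python
def binomial_series(x, m, n_terms):
    """Partial sum of Eq. (5.77): terms x^0 .. x^(n_terms - 1)."""
    coeff, total = 1.0, 0.0
    for n in range(n_terms):
        total += coeff * x ** n
        # m(m-1)...(m-n+1)/n!  ->  m(m-1)...(m-n)/(n+1)!
        coeff = coeff * (m - n) / (n + 1)
    return total


# nonintegral power: compare with direct evaluation of (1 + x)^m
print(binomial_series(0.5, -0.5, 60), 1.5 ** -0.5)
# nonnegative integer power: the series terminates at n = m
print(binomial_series(2.0, 3, 10), 3.0 ** 3)
```

For a nonnegative integer m the recurrence produces a zero coefficient at n = m + 1, which reproduces the automatic termination noted in the text.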

For polynomials we can generalize the binomial expansion to

$$(a_1 + a_2 + \cdots + a_m)^n = \sum \frac{n!}{n_1!\, n_2! \cdots n_m!}\, a_1^{n_1} a_2^{n_2} \cdots a_m^{n_m},$$

where the summation includes all different combinations of $n_1, n_2, \ldots, n_m$
with $\sum_{i=1}^{m} n_i = n$. Here, $n_i$ and n are all integral. This generalization finds
considerable use in statistical mechanics.

Maclaurin series may sometimes appear indirectly rather than by direct
use of Eq. (5.65). For instance, the most convenient way to obtain the series
expansion

$$\sin^{-1} x = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!} \frac{x^{2n+1}}{(2n+1)} = x + \frac{x^3}{6} + \frac{3x^5}{40} + \cdots \tag{5.81}$$

is to make use of the relation (from $\sin y = x$, get $dy/dx = 1/\sqrt{1 - x^2}$)

$$\sin^{-1} x = \int_0^x \frac{dt}{(1 - t^2)^{1/2}}.$$

We expand $(1 - t^2)^{-1/2}$ (binomial theorem) and then integrate term by term. This
term-by-term integration is discussed in Section 5.7. The result is Eq. (5.81). 
Finally, we may take the limit as x —> 1. The series converges by the integral 
test. 
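
Eq. (5.81) can be checked against a library arcsine. In this sketch the function name is an illustrative assumption, and the ratio $(2n-1)!!/(2n)!!$ is built up by the recurrence factor $(2n+1)/(2n+2)$ starting from 1 at n = 0. Convergence is fast inside the interval and slow at the end point x = 1.

```python
import math


def arcsin_series(x, n_terms):
    """Partial sum of Eq. (5.81) with n_terms terms."""
    total = 0.0
    ratio = 1.0   # (2n-1)!!/(2n)!!, equal to 1 for n = 0
    for n in range(n_terms):
        total += ratio * x ** (2 * n + 1) / (2 * n + 1)
        ratio *= (2 * n + 1) / (2 * n + 2)
    return total


print(arcsin_series(0.5, 40), math.asin(0.5))
print(arcsin_series(1.0, 10000), math.asin(1.0))
```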


Biographical Data 

Maclaurin, Colin. Maclaurin, a Scottish mathematician, was born in 1698 
in Kilmodan, County of Argyle, and died in 1746 in Edinburgh, Scotland. 
He entered the University of Glasgow at age 11 and obtained a degree in 
mathematics in 1715. In 1717, when he was 18, he was appointed professor 
of mathematics, enjoying Newton’s recommendation. He extended and so¬ 
lidified Newton’s calculus and was probably the greatest mathematician of 
the generation following Newton. 


Taylor Expansion—More Than One Variable 


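If the function f has more than one independent variable, for example, f =
f(x, y), the Taylor expansion becomes

$$f(x, y) = f(a, b) + (x-a)\frac{\partial f}{\partial x} + (y-b)\frac{\partial f}{\partial y}$$

$$+ \frac{1}{2!}\left[(x-a)^2 \frac{\partial^2 f}{\partial x^2} + 2(x-a)(y-b)\frac{\partial^2 f}{\partial x\,\partial y} + (y-b)^2 \frac{\partial^2 f}{\partial y^2}\right]$$

$$+ \frac{1}{3!}\left[(x-a)^3 \frac{\partial^3 f}{\partial x^3} + 3(x-a)^2 (y-b)\frac{\partial^3 f}{\partial x^2\,\partial y} + 3(x-a)(y-b)^2 \frac{\partial^3 f}{\partial x\,\partial y^2} + (y-b)^3 \frac{\partial^3 f}{\partial y^3}\right] + \cdots, \tag{5.82}$$

with all derivatives evaluated at the point (a, b). Using $a_j = x_j - x_{j0}$, we may
write the Taylor expansion for m independent variables in the form

$$f(x_1, \ldots, x_m) = \sum_{n=0}^{\infty} \frac{1}{n!} \left(\sum_{i=1}^{m} a_i \frac{\partial}{\partial x_i}\right)^n f(x_k)\Big|_{x_k = x_{k0}}.$$

A convenient (m-dimensional) vector form is

$$f(\mathbf{r} + \mathbf{a}) = \sum_{n=0}^{\infty} \frac{1}{n!} (\mathbf{a} \cdot \nabla)^n f(\mathbf{r}). \tag{5.83}$$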


SUMMARY 


The Taylor expansion generates all elementary functions and is of immense 
importance in physics applications, often, upon truncating it, serving as the 
start of controlled approximations. The Maclaurin series are special cases of 
Taylor expansions at the origin.

EXERCISES


5.6.1 Show that

$$\text{(a)} \ \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!},$$

$$\text{(b)} \ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}.$$

In Section 6.1, $e^{ix}$ is defined by a series expansion such that
$e^{ix} = \cos x + i \sin x$.

This is the basis for the polar representation of complex quantities. As
a special case we find, with $x = \pi$,

$$e^{i\pi} = -1.$$


5.6.2 Derive a series expansion of cot x in increasing powers of x by dividing
cos x by sin x.

Note. The resultant series that starts with 1/x is actually a Laurent
series (Section 6.5). Although the two series for sin x and cos x were
valid for all x, the convergence of the series for cot x is limited by the
zeros of the denominator, sin x (see Section 6.5).

5.6.3 Show by series expansion that

$$\frac12 \ln \frac{\eta_0 + 1}{\eta_0 - 1} = \coth^{-1} \eta_0, \qquad |\eta_0| > 1.$$

This identity may be used to obtain a second solution for Legendre’s
equation.

5.6.4 Show that $f(x) = x^{1/2}$ (a) has no Maclaurin expansion but (b) has a
Taylor expansion about any point $x_0 \ne 0$. Find the range of convergence
of the Taylor expansion about $x = x_0$.


5.6.5 Let x be an approximation for a zero of f(x) and $\Delta x$ the correction.
Show that by neglecting terms of order $(\Delta x)^2$

$$\Delta x = -\frac{f(x)}{f'(x)}.$$

This is Newton’s formula for finding a root. Newton’s method has the
virtues of illustrating series expansions and elementary calculus but is
very treacherous. See Appendix A1 for details and an alternative.


5.6.6 Expand a function $\Phi(x, y, z)$ by Taylor’s expansion. Evaluate $\bar{\Phi}$, the
average value of $\Phi$, averaged over a small cube of side a centered on
the origin and show that the Laplacian of $\Phi$ is a measure of deviation
of $\bar{\Phi}$ from $\Phi(0, 0, 0)$.


5.6.7 The ratio of two differentiable functions f(x) and g(x) takes on the
indeterminate form 0/0 at $x = x_0$. Using Taylor expansions prove
l’Hôpital’s rule:

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}.$$


5.6.8 With n > 1, show that

$$\text{(a)} \ \frac{1}{n} - \ln\left(\frac{n}{n-1}\right) < 0, \qquad \text{(b)} \ \frac{1}{n} - \ln\left(\frac{n+1}{n}\right) > 0.$$

Use these inequalities to show that the limit defining the Euler-
Mascheroni constant is finite.

5.6.9 Expand $(1 - 2tz + t^2)^{-1/2}$ in powers of t for any real number z so that
$|-2tz + t^2| < 1$. Assume that t is small. Collect the coefficients of $t^0$, $t^1$,
and $t^2$.


ANS. $a_0 = P_0(z) = 1$,
$a_1 = P_1(z) = z$,
$a_2 = P_2(z) = \frac12 (3z^2 - 1)$,
where $a_n = P_n(z)$, the nth Legendre polynomial.


5.6.10 Using the double factorial notation of Exercise 5.2.12, show that

$$(1+x)^{-m/2} = \sum_{n=0}^{\infty} (-1)^n \frac{(m+2n-2)!!}{2^n\, n!\,(m-2)!!}\, x^n, \qquad m = 1, 2, 3, \ldots.$$


5 . 6.11 Evaluate the Fresnel integral [~ sin x 2 dx by a Taylor expansion. 

5.6.12 At CEBAF electrons are accelerated to relativistic energies in the GeV

range. Determine v/c from ^ 1 — where the accelerator volt¬ 

age V is in millions of Volts. Truncate the binomial series at a suitable 
order to find v/c to one significant decimal place for V = 200 million 
Volts. 


5.6.13 Find the Taylor expansion of the function exp(tan x).

5.6.14 Verify the relativistic energy formula

$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}} = mc^2 + \frac{m}{2} v^2 + \cdots.$$

At what speed (in units of c) is there a 10% correction to the nonrelativistic
kinetic energy?


5 . 6.15 How much time dilation occurs during a 2-hr trip tracking at 65 miles 
per hour? 

5 . 6.16 A cosmic ray event is recorded with an energy of 3 x 10 20 eV. If the 
particle is a proton (electron), calculate the difference between its 
speed and the velocity of light.

5.6.17 Using binomial expansions, compare the three Doppler shift formulas:

$$\text{(a)} \ \nu' = \nu \left(1 \mp \frac{v}{c}\right)^{-1}, \quad \text{moving source;}$$

$$\text{(b)} \ \nu' = \nu \left(1 \pm \frac{v}{c}\right), \quad \text{moving observer;}$$

$$\text{(c)} \ \nu' = \nu \left(1 \pm \frac{v}{c}\right)\left(1 - \frac{v^2}{c^2}\right)^{-1/2}, \quad \text{relativistic.}$$

Note. The relativistic formula agrees with the classical formulas if terms
of order $v^2/c^2$ can be neglected.

5.6.18 The relativistic sum w of two velocities u and v is given by

$$\frac{w}{c} = \frac{u/c + v/c}{1 + uv/c^2}.$$

If

$$\frac{v}{c} = \frac{u}{c} = 1 - \alpha,$$

where $0 \le \alpha \le 1$, find w/c in powers of $\alpha$ through terms in $\alpha^3$.

5.6.19 The displacement x of a particle of rest mass $m_0$, resulting from a
constant force $m_0 g$ along the x-axis, is

$$x = \frac{c^2}{g} \left\{ \left[1 + \left(\frac{g t}{c}\right)^2\right]^{1/2} - 1 \right\},$$

including relativistic effects. Find the displacement x as a power series
in time t. Compare with the classical result

$$x = \tfrac12 g t^2.$$



5.6.20 With binomial expansions

$$\frac{x}{1-x} = \sum_{n=1}^{\infty} x^n, \qquad \frac{x}{x-1} = \frac{1}{1 - x^{-1}} = \sum_{n=0}^{\infty} x^{-n}.$$

Adding these two series yields $\sum_{n=-\infty}^{\infty} x^n = 0$.

Hopefully, we can agree that this is nonsense, but what has gone wrong?

5.6.21 (a) Planck’s theory of quantized oscillators leads to an average energy

$$\langle \varepsilon \rangle = \frac{\sum_{n=1}^{\infty} n \varepsilon_0 \exp(-n\varepsilon_0/kT)}{\sum_{n=0}^{\infty} \exp(-n\varepsilon_0/kT)},$$

where $\varepsilon_0$ is a fixed energy. Identify the numerator and denominator
as binomial expansions and show that the ratio is

$$\langle \varepsilon \rangle = \frac{\varepsilon_0}{\exp(\varepsilon_0/kT) - 1}.$$

(b) Show that the $\langle \varepsilon \rangle$ of part (a) reduces to kT, the classical result, for
$kT \gg \varepsilon_0$.

5.6.22 (a) Expand by the binomial theorem and integrate term by term to
obtain the Gregory series for $y = \tan^{-1} x$ (note $\tan y = x$):

$$\tan^{-1} x = \int_0^x \frac{dt}{1 + t^2} = \int_0^x \{1 - t^2 + t^4 - t^6 + \cdots\}\,dt = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1}, \qquad -1 \le x \le 1.$$

(b) By comparing series expansions, show that

$$\tan^{-1} x = \frac{i}{2} \ln\left(\frac{1 - ix}{1 + ix}\right).$$

Hint. Compare Exercise 5.4.1.

5.6.23 In numerical analysis it is often convenient to approximate $d^2\psi(x)/dx^2$
by

$$\frac{d^2}{dx^2}\psi(x) \approx \frac{1}{h^2}\left[\psi(x + h) - 2\psi(x) + \psi(x - h)\right].$$

Find the error in this approximation.

ANS. Error $= \dfrac{h^2}{12}\psi^{(4)}(x)$.


5.6.24 You have a function y(x) tabulated at equally spaced values of the
argument

$$y_n = y(x_n), \qquad x_n = x_0 + nh.$$

Show that the linear combination

$$\frac{1}{12h}\{-y_2 + 8y_1 - 8y_{-1} + y_{-2}\}$$

yields

$$y_0' - \frac{h^4}{30} y_0^{(5)} + \cdots.$$

Hence, this linear combination yields $y_0'$ if $(h^4/30)y_0^{(5)}$ and higher powers
of h and higher derivatives of y(x) are negligible.

5.6.25 In a numerical integration of a partial differential equation the three-
dimensional Laplacian is replaced by

$$\nabla^2 \psi(x, y, z) \to h^{-2}[\psi(x + h, y, z) + \psi(x - h, y, z) + \psi(x, y + h, z) + \psi(x, y - h, z) + \psi(x, y, z + h) + \psi(x, y, z - h) - 6\psi(x, y, z)].$$

Determine the error in this approximation. Here, h is the step size, the
distance between adjacent points in the x-, y-, or z-direction.

5.6.26 Find the Taylor expansion of the error function $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$
and compare it to the asymptotic expansion of Section 5.10. How many
terms are needed to obtain erf(0.1), erf(1), erf(2) for both cases? Use
symbolic software or write a numerical program.


5.7 Power Series 


The power series is a special and extremely useful type of infinite series of the
form

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n=0}^{\infty} a_n x^n, \tag{5.84}$$

where the coefficients $a_i$ are constants, independent of x.¹⁰


Equation (5.84) may readily be tested for convergence by either the Cauchy 
root test or the d’Alembert ratio test (Section 5.2). If 

$$\lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| = R^{-1}, \tag{5.85}$$

the series converges for $-R < x < R$. This is the interval of convergence.
Since the root and ratio tests fail when the limit is unity, the end points of the
interval require special attention.

For instance, if $a_n = n^{-1}$, then R = 1 and, from Sections 5.1-5.3, the series
converges for x = -1 but diverges for x = +1. If $a_n = n!$, then R = 0 and the
series diverges for all $x \neq 0$.


Convergence 


Uniform and Absolute Convergence 


Suppose our power series [Eq. (5.84)] has been found convergent for $-R <
x < R$; then it will be uniformly and absolutely convergent in any interior
interval, $-S \le x \le S$, where $0 < S < R$.

This may be proved directly by the Weierstrass M test (Section 5.5). 


10 Equation (5.84) may be generalized to z = x + iy, replacing x. Chapters 6 and 7 deal with
uniform convergence, integrability, and differentiability in a region of a complex plane in place of
an interval on the x-axis.

Continuity


Since each of the terms $u_n(x) = a_n x^n$ is a continuous function of x and $f(x) =
\sum a_n x^n$ converges uniformly for $-S \le x \le S$, f(x) must be a continuous
function in the interval of uniform convergence.

This behavior is to be contrasted with the strikingly different behavior of 
the Fourier series (Chapter 14), in which the Fourier series is frequently used 
to represent discontinuous functions, such as sawtooth and square waves. 


Differentiation and Integration 


With $u_n(x)$ continuous and $\sum a_n x^n$ uniformly convergent, we find that the
differentiated series is a power series with continuous functions and the same 
radius of convergence as the original series. The new factors introduced by 
differentiation (or integration) do not affect either the root or the ratio test. 
Therefore, our power series may be differentiated or integrated as often 
as desired within the interval of uniform convergence (Exercise 5.7.9). 

In view of the severe restrictions placed on differentiation of series
(Section 5.5), this is a remarkable and valuable result.


Uniqueness Theorem 


In the preceding section, using the Maclaurin series, we expanded $e^x$ and
ln(1 + x) into infinite series. In the succeeding chapters, functions are frequently
represented or perhaps defined by infinite series. We now establish that the
power-series representation is unique.

If

$$f(x) = \sum_{n=0}^{\infty} a_n x^n, \quad -R_a < x < R_a$$
$$\phantom{f(x)} = \sum_{n=0}^{\infty} b_n x^n, \quad -R_b < x < R_b, \tag{5.86}$$

with overlapping intervals of convergence, including the origin, then

$$a_n = b_n \tag{5.87}$$

for all n; that is, we assume two (different) power-series representations and
then proceed to show that the two are actually identical.

From Eq. (5.86),

$$\sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} b_n x^n, \quad -R < x < R, \tag{5.88}$$

where R is the smaller of $R_a$, $R_b$. By setting x = 0, to eliminate all but the
constant terms, we obtain

$$a_0 = b_0. \tag{5.89}$$

Now, exploiting the differentiability of our power series, we differentiate
Eq. (5.88), obtaining

$$\sum_{n=1}^{\infty} n a_n x^{n-1} = \sum_{n=1}^{\infty} n b_n x^{n-1}. \tag{5.90}$$

We again set x = 0 to isolate the new constant terms and find

$$a_1 = b_1. \tag{5.91}$$

By repeating this process n times, we get

$$a_n = b_n, \tag{5.92}$$

which shows that the two series coincide. Therefore, our power-series
representation is unique.

This will be a crucial point in Section 8.5, in which we use a power series 
to develop solutions of differential equations. This uniqueness of power series 
appears frequently in theoretical physics. The establishment of perturbation 
theory in quantum mechanics is one example. The power-series representation 
of functions is often useful in evaluating indeterminate forms, particularly 
when l’Hopital’s rule may be awkward to apply. 

The uniqueness of power series means that the coefficients $a_n$ may be
identified with the derivatives in a Maclaurin series. From

$$f(x) = \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(0)\,x^n \tag{5.93}$$

we have

$$a_n = \frac{1}{n!} f^{(n)}(0). \tag{5.94}$$



Inversion of Power Series 


Suppose we are given a series

$$y - y_0 = a_1(x - x_0) + a_2(x - x_0)^2 + \cdots = \sum_{n=1}^{\infty} a_n (x - x_0)^n. \tag{5.95}$$

This gives $(y - y_0)$ in terms of $(x - x_0)$. However, it may be desirable to have
an explicit expression for $(x - x_0)$ in terms of $(y - y_0)$. We may solve Eq. (5.95)
for $x - x_0$ by inversion of our series. Assume that

$$x - x_0 = \sum_{n=1}^{\infty} b_n (y - y_0)^n, \tag{5.96}$$

with the $b_n$ to be determined in terms of the assumed known $a_n$. A brute-force
approach, which is perfectly adequate for the first few coefficients, is simply
to substitute Eq. (5.95) into Eq. (5.96). By equating coefficients of $(x - x_0)^n$ on
both sides of Eq. (5.96), since the power series is unique, we obtain

$$b_1 = \frac{1}{a_1}, \qquad b_2 = -\frac{a_2}{a_1^3}, \qquad b_3 = \frac{1}{a_1^5}\left(2a_2^2 - a_1 a_3\right),$$
$$b_4 = \frac{1}{a_1^7}\left(5a_1 a_2 a_3 - a_1^2 a_4 - 5a_2^3\right), \text{ and so on.} \tag{5.97}$$

Some of the higher coefficients are listed by Dwight.¹¹


SUMMARY


Power series are Maclaurin series, that is, Taylor expansions at the origin. They
can be integrated when they converge uniformly and differentiated when the
differentiated series converges uniformly. Within the interval of convergence
they can be added and subtracted. Multiplication is only safe when at least one
of them converges absolutely.
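
The inversion coefficients of Eq. (5.97) can be checked on a series whose inverse is known in closed form. The sketch below is only an illustration: the function name `invert_series` and the choice of test function are assumptions, not part of the text. It uses $y = e^x - 1$, for which $a_n = 1/n!$ and the inverse $x = \ln(1 + y)$ has $b_n = (-1)^{n-1}/n$.

```python
from fractions import Fraction
from math import factorial


def invert_series(a1, a2, a3, a4):
    """Return (b1, b2, b3, b4) of the inverted series, following Eq. (5.97)."""
    b1 = 1 / a1
    b2 = -a2 / a1**3
    b3 = (2 * a2**2 - a1 * a3) / a1**5
    b4 = (5 * a1 * a2 * a3 - a1**2 * a4 - 5 * a2**3) / a1**7
    return b1, b2, b3, b4


# Test case (an assumption for illustration): y = e^x - 1, so a_n = 1/n!.
a = [Fraction(1, factorial(n)) for n in range(1, 5)]
b = invert_series(*a)

# The inverse is x = ln(1 + y), whose coefficients are (-1)^(n-1)/n.
expected = tuple(Fraction((-1) ** (n - 1), n) for n in range(1, 5))
print(b)
print(b == expected)
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.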

EXERCISES 


5.7.1 The classical Langevin theory of paramagnetism leads to an expression
for the magnetic polarization

$$P(x) = c\left(\frac{\cosh x}{\sinh x} - \frac{1}{x}\right).$$

Expand P(x) as a power series for small x (low fields, high temperature).

5.7.2 The analysis of the diffraction pattern of a circular opening involves

$$\int_0^{2\pi} \cos(c \cos\varphi)\,d\varphi.$$

Expand the integrand in a series and integrate by using

$$\int_0^{2\pi} \cos^{2n}\varphi\,d\varphi = \frac{(2n)!}{2^{2n}(n!)^2} \cdot 2\pi, \qquad \int_0^{2\pi} \cos^{2n+1}\varphi\,d\varphi = 0.$$

The result is $2\pi$ times the Bessel function $J_0(c)$.

5.7.3 Given that

$$\int_0^1 \frac{dx}{1 + x^2} = \tan^{-1} x \Big|_0^1 = \frac{\pi}{4},$$

expand the integrand into a series and integrate term by term,
obtaining¹²

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots + (-1)^n \frac{1}{2n+1} + \cdots,$$


11 Dwight, H. B. (1961). Tables of Integrals and Other Mathematical Data, 4th ed. Macmillan, 
New York. (Compare Formula No. 60.) 

12 The series expansion of $\tan^{-1} x$ (upper limit 1 replaced by x) was discovered by James Gregory
in 1671, 3 years before Leibniz. See Peter Beckmann's entertaining book, A History of Pi, 2nd ed.
Golem Press, Boulder, CO (1971); and L. Berggren, J. Borwein, and P. Borwein, Pi: A Source Book.
Springer, New York (1997).

which is Leibniz's formula for $\pi$. Compare the convergence (or lack of
it) of the integrand series and the integrated series at x = 1.


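5.7.4 Expand the incomplete factorial function

$$\int_0^x e^{-t} t^n\,dt$$

in a series of powers of x. What is the range of convergence of the
resulting series?

ANS. $\displaystyle\int_0^x e^{-t} t^n\,dt = x^{n+1}\left\{\frac{1}{n+1} - \frac{x}{n+2} + \frac{x^2}{2!(n+3)} - \cdots + \frac{(-1)^p x^p}{p!(n+p+1)} + \cdots\right\}, \quad |x| < \infty.$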


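5.7.5 Evaluate

(a) $\displaystyle\lim_{x \to 0}\,[\sin(\tan x) - \tan(\sin x)]\,x^{-7}$, (b) $\displaystyle\lim_{x \to 0} x^{-n} j_n(x)$, n = 3,

where $j_n(x)$ is a spherical Bessel function (Section 12.4) defined by

$$j_n(x) = (-1)^n x^n \left(\frac{d}{x\,dx}\right)^n \left(\frac{\sin x}{x}\right).$$

ANS. (a) $-\dfrac{1}{30}$, (b) $\dfrac{1}{1 \cdot 3 \cdot 5 \cdots (2n+1)} \to \dfrac{1}{105}$ for n = 3.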


5.7.6 Neutron transport theory gives the following expression for the inverse
neutron diffusion length k:

$$\frac{a - b}{k} \tanh^{-1}\left(\frac{k}{a}\right) = 1.$$

By series inversion or otherwise, determine $k^2$ as a series of powers of
b/a. Give the first two terms of the series.

ANS. $k^2 = 3ab\left(1 - \dfrac{4}{5}\dfrac{b}{a}\right)$.

5.7.7 Develop a series expansion of $y = \sinh^{-1} x$ (i.e., sinh y = x) in powers
of x by

(a) inversion of the series for sinh y, and 

(b) a direct Maclaurin expansion. 


5.7.8 A function f(z) is represented by a descending power series

$$f(z) = \sum_{n=0}^{\infty} a_n z^{-n}, \quad R \le z < \infty.$$

Show that this series expansion is unique; that is, if $f(z) = \sum_{n=0}^{\infty} b_n z^{-n}$,
$R \le z < \infty$, then $a_n = b_n$ for all n.


5.7.9 A power series converges for $-R < x < R$. Show that the differentiated
series and the integrated series have the same interval of convergence.
(Do not bother with the end points $x = \pm R$.)

5.7.10 Assume that f(x) may be expanded in a power series about the origin,
$f(x) = \sum_{n=0}^{\infty} a_n x^n$, with some nonzero range of convergence. Use the
techniques employed in proving uniqueness of series to show that your
assumed series is a Maclaurin series:

$$a_n = \frac{1}{n!} f^{(n)}(0).$$


5.7.11 Show that each of the following integrals equals Catalan's constant:

(a) $\displaystyle\int_0^1 \tan^{-1} t\,\frac{dt}{t}$, (b) $\displaystyle -\int_0^1 \ln x\,\frac{dx}{1 + x^2}$.


5.7.12 Use the integral test for the evaluation of finite series and show that

(a) $\displaystyle\sum_{m=1}^{n} m = \frac{1}{2}n(n+1)$.

(b) $\displaystyle\sum_{m=1}^{n} m^2 = \frac{1}{6}n(n+1)(2n+1)$.

(c) $\displaystyle\sum_{m=1}^{n} m^3 = \frac{1}{4}n^2(n+1)^2$.

(d) $\displaystyle\sum_{m=1}^{n} m^4 = \frac{1}{30}n(n+1)(2n+1)(3n^2+3n-1)$.


5.8 Elliptic Integrals 


Elliptic integrals are included here partly as an illustration of the use of power 
series and partly for their own intrinsic interest. This interest includes the 
occurrence of elliptic integrals in physical problems (Example 5.8.1 and
Exercise 5.8.4) and applications in mathematical problems.


EXAMPLE 5.8.1 


Period of a Simple Pendulum For small-amplitude oscillations our pendulum
(Fig. 5.4) has simple harmonic motion with a period $T = 2\pi(l/g)^{1/2}$.
For a maximum amplitude $\theta_M$ large enough so that $\sin\theta_M \neq \theta_M$, Newton's second
law of motion (and Lagrange's equation) leads to a nonlinear differential
equation ($\sin\theta$ is a nonlinear function of $\theta$), so we turn to a different approach.

The swinging mass m has a kinetic energy of $ml^2(d\theta/dt)^2/2$ and a potential
energy of $-mgl\cos\theta$ ($\theta = \pi/2$ taken for the arbitrary zero of potential energy).
Since $d\theta/dt = 0$ at $\theta = \theta_M$, conservation of energy gives

$$\frac{1}{2} m l^2 \left(\frac{d\theta}{dt}\right)^2 - mgl\cos\theta = -mgl\cos\theta_M. \tag{5.98}$$

Solving for $d\theta/dt$ we obtain

$$\frac{d\theta}{dt} = \pm\left(\frac{2g}{l}\right)^{1/2} (\cos\theta - \cos\theta_M)^{1/2} \tag{5.99}$$

with the mass m canceling out. We take t to be zero when $\theta = 0$ and $d\theta/dt > 0$.
An integration from $\theta = 0$ to $\theta = \theta_M$ yields

$$\int_0^{\theta_M} (\cos\theta - \cos\theta_M)^{-1/2}\,d\theta = \left(\frac{2g}{l}\right)^{1/2} \int_0^t dt = \left(\frac{2g}{l}\right)^{1/2} t. \tag{5.100}$$

This is one-fourth of a cycle, and therefore the time t is one-fourth of the period,
T. We note that $\theta \le \theta_M$, and with a bit of clairvoyance we try the half-angle
substitution

$$\sin\left(\frac{\theta}{2}\right) = \sin\left(\frac{\theta_M}{2}\right) \sin\varphi. \tag{5.101}$$

With this, Eq. (5.100) becomes

$$T = 4\left(\frac{l}{g}\right)^{1/2} \int_0^{\pi/2} \left(1 - \sin^2\left(\frac{\theta_M}{2}\right) \sin^2\varphi\right)^{-1/2} d\varphi. \tag{5.102}$$

Although not an obvious improvement over Eq. (5.100), the integral now
defines the complete elliptic integral of the first kind, $K(\sin^2 \theta_M/2)$. From the
series expansion, the period of our pendulum may be developed as a power
series in powers of $\sin \theta_M/2$:

$$T = 2\pi\left(\frac{l}{g}\right)^{1/2} \left\{1 + \frac{1}{4}\sin^2\frac{\theta_M}{2} + \frac{9}{64}\sin^4\frac{\theta_M}{2} + \cdots\right\}. \tag{5.103}$$
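
The amplitude dependence in Eqs. (5.102) and (5.103) can be explored numerically. The following is a minimal sketch, not the text's method: it evaluates K(m) with the arithmetic-geometric mean, a standard technique that the example itself does not use, and compares the ratio $T/T_0$, with $T_0 = 2\pi(l/g)^{1/2}$, against the three terms written out in Eq. (5.103). The function names are hypothetical.

```python
import math


def ellip_k(m, tol=1e-14):
    """Complete elliptic integral K(m), 0 <= m < 1.

    Uses the arithmetic-geometric mean, K(m) = pi / (2 * AGM(1, sqrt(1 - m))).
    This is a swapped-in numerical technique, not the series used in the text.
    """
    a, b = 1.0, math.sqrt(1.0 - m)
    while abs(a - b) > tol * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def period_ratio(theta_max_deg):
    """T / T0 from Eq. (5.102), with m = sin^2(theta_M / 2)."""
    m = math.sin(math.radians(theta_max_deg) / 2.0) ** 2
    return 2.0 * ellip_k(m) / math.pi


def period_ratio_series(theta_max_deg):
    """The three terms written out in Eq. (5.103)."""
    s2 = math.sin(math.radians(theta_max_deg) / 2.0) ** 2
    return 1.0 + s2 / 4.0 + 9.0 * s2**2 / 64.0


for theta in (1.0, 20.0, 60.0, 120.0):
    print(theta, period_ratio(theta), period_ratio_series(theta))
```

For small amplitudes the two columns agree closely; the truncated series falls behind as the amplitude grows, which is the expected behavior of a power series in $\sin^2(\theta_M/2)$.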


Definitions 


Generalizing Example 5.8.1 to include the upper limit as a variable, the elliptic
integral of the first kind is defined as

$$F(\varphi \backslash \alpha) = \int_0^{\varphi} (1 - \sin^2\alpha \sin^2\theta)^{-1/2}\,d\theta \tag{5.104a}$$

or

$$F(x|m) = \int_0^x [(1 - t^2)(1 - mt^2)]^{-1/2}\,dt, \quad 0 \le m < 1. \tag{5.104b}$$

[This is the notation of the National Bureau of Standards (1972). Handbook of
Mathematical Functions, AMS-55. Dover, New York.] For $\varphi = \pi/2$, x = 1, we
have the complete elliptic integral of the first kind,

$$K(m) = \int_0^{\pi/2} (1 - m\sin^2\theta)^{-1/2}\,d\theta = \int_0^1 [(1 - t^2)(1 - mt^2)]^{-1/2}\,dt, \tag{5.105}$$

with $m = \sin^2\alpha$, $0 \le m < 1$.

The elliptic integral of the second kind is defined by

$$E(\varphi \backslash \alpha) = \int_0^{\varphi} (1 - \sin^2\alpha \sin^2\theta)^{1/2}\,d\theta \tag{5.106a}$$

or

$$E(x|m) = \int_0^x \left(\frac{1 - mt^2}{1 - t^2}\right)^{1/2} dt, \quad 0 \le m \le 1. \tag{5.106b}$$

Again, for the case $\varphi = \pi/2$, x = 1, we have the complete elliptic integral
of the second kind:

$$E(m) = \int_0^{\pi/2} (1 - m\sin^2\theta)^{1/2}\,d\theta = \int_0^1 \left(\frac{1 - mt^2}{1 - t^2}\right)^{1/2} dt, \quad 0 \le m \le 1. \tag{5.107}$$

Exercise 5.8.1 is an example of its occurrence. Figure 5.5 shows the behavior
of K(m) and E(m). Extensive tables are available in AMS-55.


Series Expansion 


For our range $0 \le m < 1$, the denominator of K(m) may be expanded by the
binomial series

$$(1 - m\sin^2\theta)^{-1/2} = 1 + \frac{1}{2}m\sin^2\theta + \frac{3}{8}m^2\sin^4\theta + \cdots = \sum_{n=0}^{\infty} \frac{(2n-1)!!}{(2n)!!}\,m^n \sin^{2n}\theta. \tag{5.108}$$

For any closed interval $[0, m_{\max}]$, $m_{\max} < 1$, this series is uniformly convergent
and may be integrated term by term. From Exercise 5.7.2

$$\int_0^{\pi/2} \sin^{2n}\theta\,d\theta = \frac{(2n-1)!!}{(2n)!!} \cdot \frac{\pi}{2}. \tag{5.109}$$

Figure 5.5 Complete Elliptic Integrals, K(m) and E(m)

Hence,

$$K(m) = \frac{\pi}{2}\left\{1 + \sum_{n=1}^{\infty} \left[\frac{(2n-1)!!}{(2n)!!}\right]^2 m^n\right\}. \tag{5.110}$$

Similarly,

$$E(m) = \frac{\pi}{2}\left\{1 - \sum_{n=1}^{\infty} \left[\frac{(2n-1)!!}{(2n)!!}\right]^2 \frac{m^n}{2n-1}\right\} \tag{5.111}$$

(Exercise 5.8.2). These series are known as hypergeometric functions ${}_2F_1(a, b, c; x)$,
and we have

$$K(m) = \frac{\pi}{2}\,{}_2F_1\left(\frac{1}{2}, \frac{1}{2}, 1; m\right), \tag{5.112}$$

$$E(m) = \frac{\pi}{2}\,{}_2F_1\left(-\frac{1}{2}, \frac{1}{2}, 1; m\right). \tag{5.113}$$
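
As a rough numerical illustration of Eqs. (5.110) and (5.111), the sketch below sums the two series and compares them with a direct midpoint-rule evaluation of the defining integrals, Eqs. (5.105) and (5.107). The function names, the number of terms, and the choice of quadrature are assumptions made for this example only.

```python
import math


def k_series(m, terms=200):
    """Partial sum of Eq. (5.110) for K(m)."""
    total, ratio = 1.0, 1.0
    for n in range(1, terms + 1):
        ratio *= (2 * n - 1) / (2 * n)      # builds (2n-1)!!/(2n)!! step by step
        total += ratio**2 * m**n
    return 0.5 * math.pi * total


def e_series(m, terms=200):
    """Partial sum of Eq. (5.111) for E(m)."""
    total, ratio = 1.0, 1.0
    for n in range(1, terms + 1):
        ratio *= (2 * n - 1) / (2 * n)
        total -= ratio**2 * m**n / (2 * n - 1)
    return 0.5 * math.pi * total


def midpoint(f, a, b, steps=20000):
    """Composite midpoint rule, used here only as an independent check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


m = 0.5
k_quad = midpoint(lambda t: (1.0 - m * math.sin(t) ** 2) ** -0.5, 0.0, math.pi / 2)
e_quad = midpoint(lambda t: (1.0 - m * math.sin(t) ** 2) ** 0.5, 0.0, math.pi / 2)
print("K:", k_series(m), k_quad)
print("E:", e_series(m), e_quad)
```

Raising m toward 1 while keeping `terms` fixed shows the slow convergence of the series near the upper end of the range.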
Limiting Values


From the series Eqs. (5.110) and (5.111), or from the defining integrals,

$$\lim_{m \to 0} K(m) = \frac{\pi}{2}, \tag{5.114}$$

$$\lim_{m \to 0} E(m) = \frac{\pi}{2}. \tag{5.115}$$

For $m \to 1$ the series expansions are of little use. However, the integrals yield

$$\lim_{m \to 1} K(m) = \infty, \tag{5.116}$$

the integral diverging logarithmically, and

$$\lim_{m \to 1} E(m) = 1. \tag{5.117}$$

The elliptic integrals have been used extensively for evaluating integrals.
For instance, integrals of the form

$$I = \int_0^x R\left(t, \sqrt{a_4 t^4 + a_3 t^3 + a_2 t^2 + a_1 t + a_0}\right) dt,$$

where R is a rational function of t and of the radical, may be expressed in
terms of elliptic integrals. Jahnke and Emde, Dover (1943) (Chapter 5) discuss
such transformations in depth. With high-speed computers available for direct
numerical evaluation, interest in these elliptic integral techniques has declined.
However, elliptic integrals remain of interest because of their appearance in
physical problems (Exercises 5.8.4 and 5.8.5).

For an extensive account of elliptic functions, integrals, and Jacobi theta 
functions, the reader is directed to Whittaker and Watson’s treatise A Course 
In Modern Analysis, 4th ed. Cambridge Univ. Press, Cambridge, UK (1962). 


EXERCISES 


5.8.1 The ellipse $x^2/a^2 + y^2/b^2 = 1$ may be represented parametrically by
$x = a\sin\theta$, $y = b\cos\theta$. Show that the length of arc within the first
quadrant is

$$a\int_0^{\pi/2} (1 - m\sin^2\theta)^{1/2}\,d\theta = aE(m).$$

Here, $0 \le m = (a^2 - b^2)/a^2 \le 1$.


5.8.2 Derive the series expansion

$$E(m) = \frac{\pi}{2}\left\{1 - \left(\frac{1}{2}\right)^2 \frac{m}{1} - \left(\frac{1 \cdot 3}{2 \cdot 4}\right)^2 \frac{m^2}{3} - \cdots\right\}.$$


5.8.3 Show that

$$\lim_{m \to 0} \frac{(K - E)}{m} = \frac{\pi}{4}.$$


Figure 5.6 Circular Wire Loop



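5.8.4 A circular loop of wire in the xy-plane, as shown in Fig. 5.6, carries a
current I. Given that the vector potential is

$$A_\varphi(\rho, \varphi, z) = \frac{a\mu_0 I}{2\pi} \int_0^{\pi} \frac{\cos\alpha\,d\alpha}{(a^2 + \rho^2 + z^2 - 2a\rho\cos\alpha)^{1/2}},$$

show that

$$A_\varphi(\rho, \varphi, z) = \frac{\mu_0 I}{\pi k}\left(\frac{a}{\rho}\right)^{1/2} \left[\left(1 - \frac{k^2}{2}\right) K(k^2) - E(k^2)\right],$$

where

$$k^2 = \frac{4a\rho}{(a + \rho)^2 + z^2}.$$

Note. For extension of Exercise 5.8.4 to B see Smythe.¹³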


5.8.5 An analysis of the magnetic vector potential of a circular current loop
leads to the expression

$$f(k^2) = k^{-2}\left[(2 - k^2)K(k^2) - 2E(k^2)\right],$$

where $K(k^2)$ and $E(k^2)$ are the complete elliptic integrals of the first
and second kinds. Show that for $k^2 \ll 1$ ($r \gg$ radius of loop)

$$f(k^2) \approx \frac{\pi k^2}{16}.$$


13 Smythe, W. R. (1969). Static and Dynamic Electricity, 3rd ed., p. 270. McGraw-Hill, New York.

5.8.6 Show that

(a) $\dfrac{dE(k^2)}{dk} = \dfrac{1}{k}\left[E(k^2) - K(k^2)\right]$,

(b) $\dfrac{dK(k^2)}{dk} = \dfrac{E}{k(1 - k^2)} - \dfrac{K}{k}$.

Hint. For part (b) show that

$$E(k^2) = (1 - k^2)\int_0^{\pi/2} (1 - k^2\sin^2\theta)^{-3/2}\,d\theta$$

by comparing series expansions.


5.8.7 (a) Write a function subroutine that will compute E(m) from the series 
expansion, Eq. (5.111). 

(b) Test your function subroutine by using it to calculate E(m) over
the range m = 0.0(0.1)0.9 and comparing the result with the values
given by AMS-55.


5.8.8 Repeat Exercise 5.8.7 for K(m).

Note. These series for E(m) [Eq. (5.111)], and K(m) [Eq. (5.110)],
converge only very slowly for m near 1. More rapidly converging series for
E(m) and K(m) exist. See Dwight's Tables of Integrals,¹⁴ Nos. 773.2 and
774.2. Your computer subroutine for computing E and K probably uses
polynomial approximations (AMS-55, Chapter 17).


5.8.9 A simple pendulum is swinging with a maximum amplitude of $\theta_M$. In
the limit as $\theta_M \to 0$, the period is 1 sec. Using the elliptic integral,
$K(k^2)$, $k = \sin(\theta_M/2)$, calculate the period T for $\theta_M$ = 0 (10°) 90°.
Caution. Some elliptic integral subroutines require $k = m^{1/2}$ as an
input parameter, not m. Check values.

    θ_M        10°        50°        90°
    T (sec)    1.00193    1.05033    1.18258

5.8.10 Calculate the magnetic vector potential $\mathbf{A}(\rho, \varphi, z) = \hat{\boldsymbol{\varphi}} A_\varphi(\rho, \varphi, z)$ of a
circular current loop (Exercise 5.8.4) for the ranges $\rho/a$ = 2, 3, 4 and
z/a = 0, 1, 2, 3, 4.

Note. This elliptic integral calculation of the magnetic vector potential
may be checked by an associated Legendre function calculation.


Check value. For $\rho/a = 3$ and $z/a = 0$; $A_\varphi = 0.029023\,\mu_0 I$.


5.9 Bernoulli Numbers and the Euler-Maclaurin Formula 


14 Dwight, H. B. (1947). Tables of Integrals and Other Mathematical Data. Macmillan, New York.

The Bernoulli numbers were introduced by Jacques Bernoulli. There are several
equivalent definitions, but extreme care must be taken because some
authors introduce variations in numbering or in algebraic signs. One relatively
simple approach is to define the Bernoulli numbers by the series

$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{B_n x^n}{n!}, \tag{5.118}$$

which converges for $|x| < 2\pi$.¹⁵ By differentiating this power series repeatedly
and then setting x = 0, we obtain

$$B_n = \left[\frac{d^n}{dx^n}\left(\frac{x}{e^x - 1}\right)\right]_{x=0}. \tag{5.119}$$

Specifically,

$$B_1 = \frac{d}{dx}\left(\frac{x}{e^x - 1}\right)\bigg|_{x=0} = \left[\frac{1}{e^x - 1} - \frac{xe^x}{(e^x - 1)^2}\right]_{x=0} = -\frac{1}{2}, \tag{5.120}$$

as may be seen by series expansion of the denominators.

Using $B_0 = 1$ and $B_1 = -\frac{1}{2}$, it is easy to verify that the function

$$\frac{x}{e^x - 1} - 1 + \frac{x}{2} = \sum_{n=2}^{\infty} \frac{B_n x^n}{n!} = -\frac{x}{e^{-x} - 1} - 1 - \frac{x}{2} \tag{5.121}$$

is even in x so that all $B_{2n+1} = 0$.

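To derive a recursion relation for the Bernoulli numbers, we multiply

$$\frac{e^x - 1}{x}\,\frac{x}{e^x - 1} = 1 = \left\{\sum_{m=0}^{\infty} \frac{x^m}{(m+1)!}\right\}\left\{1 - \frac{x}{2} + \sum_{n=1}^{\infty} B_{2n}\frac{x^{2n}}{(2n)!}\right\}$$
$$= 1 + \sum_{m=1}^{\infty} x^m \left\{\frac{1}{(m+1)!} - \frac{1}{2\,m!}\right\} + \sum_{N=2}^{\infty} x^N \sum_{1 \le n \le N/2} \frac{B_{2n}}{(2n)!\,(N - 2n + 1)!}. \tag{5.122}$$

For N > 0 the coefficient of $x^N$ is zero so that Eq. (5.122) yields

$$\frac{1}{2}(N + 1) - 1 = \sum_{1 \le n \le N/2} B_{2n}\binom{N+1}{2n} = \frac{1}{2}(N - 1), \tag{5.123}$$

which is equivalent to

$$N - \frac{1}{2} = \sum_{n=1}^{N} B_{2n}\binom{2N+1}{2n},$$
$$N - 1 = \sum_{n=1}^{N-1} B_{2n}\binom{2N}{2n}. \tag{5.124}$$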
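
The first line of Eq. (5.124) can be solved for $B_{2N}$ once the lower even-index numbers are known, since the n = N term carries the factor $\binom{2N+1}{2N} = 2N + 1$. The short sketch below does this in exact rational arithmetic; the function name `bernoulli_even` is an assumption introduced for this example.

```python
from fractions import Fraction
from math import comb


def bernoulli_even(n_max):
    """Return {2N: B_2N} for N = 1..n_max, solving the first line of Eq. (5.124)."""
    b = {}
    for big_n in range(1, n_max + 1):
        # Sum over the already-known B_2, B_4, ..., B_{2N-2}.
        partial = sum(b[2 * n] * comb(2 * big_n + 1, 2 * n) for n in range(1, big_n))
        # N - 1/2 = partial + (2N + 1) * B_2N
        b[2 * big_n] = (Fraction(2 * big_n - 1, 2) - partial) / (2 * big_n + 1)
    return b


for index, value in bernoulli_even(5).items():
    print(index, value, float(value))
```

The printed fractions can be compared directly with the even-index entries of Table 5.1.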


15 The function $x/(e^x - 1)$ may be considered a generating function since it generates the
Bernoulli numbers. Generating functions of the special functions of mathematical physics are
discussed in Chapters 11-13.

Table 5.1 Bernoulli Numbers

     n     B_n       B_n (decimal)
     0     1          1.0000 00000
     1    -1/2       -0.5000 00000
     2     1/6        0.1666 66667
     4    -1/30      -0.0333 33333
     6     1/42       0.0238 09524
     8    -1/30      -0.0333 33333
    10     5/66       0.0757 57576

Note: Additional values are given in the National Bureau of Standards
Handbook of Mathematical Functions (AMS-55).

From Eq. (5.124) the Bernoulli numbers in Table 5.1 are readily obtained. If
the variable x in Eq. (5.118) is replaced by 2ix, we obtain an alternate (and
equivalent) definition of $B_{2n}$ ($B_1$ is set equal to $-\frac{1}{2}$) by the expression

$$x\cot x = \sum_{n=0}^{\infty} (-1)^n B_{2n}\frac{(2x)^{2n}}{(2n)!}, \quad -\pi < x < \pi. \tag{5.125}$$

Using the method of residues (Section 7.2) or working from the infinite product
representation of sin x, we find that

$$B_{2n} = \frac{(-1)^{n-1}\,2(2n)!}{(2\pi)^{2n}} \sum_{p=1}^{\infty} p^{-2n}, \quad n = 1, 2, 3, \ldots. \tag{5.126}$$

This representation of the Bernoulli numbers was discovered by Euler. It is
readily seen from Eq. (5.126) that $|B_{2n}|$ increases without limit as $n \to \infty$.
Numerical values have been calculated by Glaisher.¹⁶ Illustrating the divergent
behavior of the Bernoulli numbers, we have

$$B_{20} = -5.291 \times 10^{2},$$

$$B_{200} = -3.647 \times 10^{215}.$$
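
Equation (5.126) is easy to evaluate numerically because the sum over p converges very quickly once n is moderately large. The sketch below is an illustration under stated assumptions: the function name and the truncation length are hypothetical, and the prefactor is computed through logarithms (`math.lgamma`) only to avoid overflow of (2n)! for large n.

```python
import math


def bernoulli_from_zeta(n, terms=50):
    """Approximate B_2n from Eq. (5.126), truncating the sum over p."""
    zeta_sum = sum(float(p) ** (-2 * n) for p in range(1, terms + 1))
    # log of 2 (2n)! / (2 pi)^(2n); lgamma(2n + 1) = ln (2n)!
    log_prefactor = (math.log(2.0) + math.lgamma(2 * n + 1)
                     - 2 * n * math.log(2.0 * math.pi))
    return (-1) ** (n - 1) * math.exp(log_prefactor) * zeta_sum


print(bernoulli_from_zeta(1, terms=100000))   # n = 1 needs many terms
print(bernoulli_from_zeta(10))                # compare with B_20 above
print(bernoulli_from_zeta(100))               # compare with B_200 above
```

For n = 1 the truncated sum converges only as 1/terms, so many terms are needed; for large n a handful of terms already saturates double precision.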


Some authors prefer to define the Bernoulli numbers with a modified version
of Eq. (5.126) by using

$$B_n = \frac{2(2n)!}{(2\pi)^{2n}} \sum_{p=1}^{\infty} p^{-2n}, \tag{5.127}$$

with the subscript being just half of our subscript and all signs positive.
Again, when using other texts or references the reader must check carefully
to see how the Bernoulli numbers are defined.

16 Glaisher, J. W. L. Table of the first 250 Bernoulli's numbers (to nine figures) and their logarithms
(to ten figures). Trans. Cambridge Philos. Soc. 12, 390 (1871-1879).

The Bernoulli numbers occur frequently in number theory. The von Staudt-
Clausen theorem states that

$$B_{2n} = A_n - \frac{1}{p_1} - \frac{1}{p_2} - \frac{1}{p_3} - \cdots - \frac{1}{p_k}, \tag{5.128}$$

in which $A_n$ is an integer and $p_1, p_2, \ldots, p_k$ are prime numbers so that $p_i - 1$
is a divisor of 2n. It may readily be verified that this holds for

$$B_6 \quad (A_3 = 1,\ p = 2, 3, 7),$$
$$B_8 \quad (A_4 = 1,\ p = 2, 3, 5), \tag{5.129}$$
$$B_{10} \quad (A_5 = 1,\ p = 2, 3, 11),$$

and other special cases.

The Bernoulli numbers appear in the summation of integral powers of the
integers,

$$\sum_{j=1}^{N} j^p, \quad p \text{ integral},$$

and in numerous series expansions of the transcendental functions, including
tan x, cot x, ln|sin x|, $(\sin x)^{-1}$, ln|cos x|, ln|tan x|, $(\cosh x)^{-1}$, tanh x, and
coth x. For example,

$$\tan x = x + \frac{x^3}{3} + \frac{2}{15}x^5 + \cdots + \frac{(-1)^{n-1}\,2^{2n}(2^{2n} - 1)B_{2n}}{(2n)!}\,x^{2n-1} + \cdots. \tag{5.130}$$

The Bernoulli numbers are likely to come in such series expansions because
of the defining Eqs. (5.118) and (5.124) and because of their relation to the
Riemann zeta function,

$$\zeta(2n) = \sum_{p=1}^{\infty} p^{-2n}. \tag{5.131}$$


Bernoulli Functions 

If Eq. (5.118) is generalized slightly, we have

$$\frac{xe^{xs}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(s)\frac{x^n}{n!} \tag{5.132}$$

defining the Bernoulli functions, $B_n(s)$. The first seven Bernoulli functions
are given in Table 5.2.

From the generating function, Eq. (5.132),

$$B_n(0) = B_n, \quad n = 0, 1, 2, \ldots, \tag{5.133}$$

the Bernoulli function evaluated at zero equals the corresponding Bernoulli
number.

Table 5.2 Bernoulli Functions

    B_0 = 1
    B_1 = x - 1/2
    B_2 = x^2 - x + 1/6
    B_3 = x^3 - (3/2)x^2 + (1/2)x
    B_4 = x^4 - 2x^3 + x^2 - 1/30
    B_5 = x^5 - (5/2)x^4 + (5/3)x^3 - (1/6)x
    B_6 = x^6 - 3x^5 + (5/2)x^4 - (1/2)x^2 + 1/42
    B_n(0) = B_n, Bernoulli number

Two particularly important properties of the Bernoulli functions follow
from the defining relation, Eq. (5.132): a differentiation relation

$$\frac{d}{ds}B_n(s) = nB_{n-1}(s), \quad n = 1, 2, 3, \ldots, \tag{5.134}$$

and a symmetry relation [replace $x \to -x$ in Eq. (5.132), then set s = 1]

$$B_n(1) = (-1)^n B_n(0), \quad n = 1, 2, 3, \ldots. \tag{5.135}$$

These relations are used next in the development of the Euler-Maclaurin
integration formula.
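
The Bernoulli functions of Eq. (5.132) can be generated and the relations (5.133) and (5.135) checked exactly. The sketch below relies on an identity the text does not write out: multiplying the series for $e^{xs}$ by Eq. (5.118) gives $B_n(s) = \sum_k \binom{n}{k} B_k s^{n-k}$. The Bernoulli numbers are obtained from the standard recursion $\sum_{k=0}^{m} \binom{m+1}{k} B_k = 0$, which is equivalent to the recursion relations derived earlier in this section; both function names are assumptions.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """B_0..B_n_max with B_1 = -1/2, from sum_{k=0}^{m} C(m+1, k) B_k = 0."""
    b = [Fraction(1)]
    for m in range(1, n_max + 1):
        b.append(-sum(comb(m + 1, k) * b[k] for k in range(m)) / (m + 1))
    return b


def bernoulli_poly(n, s):
    """B_n(s) = sum_k C(n, k) B_k s^(n-k), the Cauchy product behind Eq. (5.132)."""
    b = bernoulli_numbers(n)
    s = Fraction(s)
    return sum(comb(n, k) * b[k] * s ** (n - k) for k in range(n + 1))


numbers = bernoulli_numbers(6)
for n in range(1, 7):
    assert bernoulli_poly(n, 0) == numbers[n]                          # Eq. (5.133)
    assert bernoulli_poly(n, 1) == (-1) ** n * bernoulli_poly(n, 0)    # Eq. (5.135)

print([str(bernoulli_poly(4, Fraction(k, 4))) for k in range(5)])
```

Evaluating `bernoulli_poly(n, s)` at a few rational points and comparing with the rows of Table 5.2 gives a quick consistency check of the table.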


Euler-Maclaurin Integration Formula 


One use of the Bernoulli functions is in the derivation of the Euler-Maclaurin
integration formula. This formula is used in Section 10.3 for the development
of an asymptotic expression for the factorial function, Stirling's series.

The technique is repeated integration by parts using Eq. (5.134) to create
new derivatives. We start with

$$\int_0^1 f(x)\,dx = \int_0^1 f(x)B_0(x)\,dx. \tag{5.136}$$

From Eq. (5.134),

$$B_1'(x) = B_0(x) = 1. \tag{5.137}$$

Substituting $B_1'(x)$ into Eq. (5.136) and integrating by parts, we obtain

$$\int_0^1 f(x)\,dx = f(1)B_1(1) - f(0)B_1(0) - \int_0^1 f'(x)B_1(x)\,dx$$
$$= \frac{1}{2}[f(1) + f(0)] - \int_0^1 f'(x)B_1(x)\,dx. \tag{5.138}$$

Again, using Eq. (5.134), we have

$$B_1(x) = \frac{1}{2}B_2'(x), \tag{5.139}$$

and integrating by parts

$$\int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \frac{1}{2!}[f'(1)B_2(1) - f'(0)B_2(0)] + \frac{1}{2!}\int_0^1 f^{(2)}(x)B_2(x)\,dx. \tag{5.140}$$

Using the relations

$$B_{2n}(1) = B_{2n}(0) = B_{2n}, \quad n = 0, 1, 2, \ldots$$
$$B_{2n+1}(1) = B_{2n+1}(0) = 0, \quad n = 1, 2, 3, \ldots \tag{5.141}$$

and continuing this process, we have

$$\int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \sum_{p=1}^{q} \frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right] + \frac{1}{(2q)!}\int_0^1 f^{(2q)}(x)B_{2q}(x)\,dx. \tag{5.142a}$$

This is the Euler-Maclaurin integration formula. It assumes that the function
f(x) has the required derivatives.

The range of integration in Eq. (5.142a) may be shifted from [0, 1] to [1, 2]
by replacing f(x) by f(x + 1). Adding such results up to [n - 1, n],

$$\int_0^n f(x)\,dx = \frac{1}{2}f(0) + f(1) + f(2) + \cdots + f(n-1) + \frac{1}{2}f(n)$$
$$- \sum_{p=1}^{q} \frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(n) - f^{(2p-1)}(0)\right] + \frac{1}{(2q)!}\int_0^1 B_{2q}(x)\sum_{\nu=0}^{n-1} f^{(2q)}(x + \nu)\,dx. \tag{5.142b}$$

The terms $\frac{1}{2}f(0) + f(1) + \cdots + \frac{1}{2}f(n)$ appear exactly as in trapezoidal
integration or quadrature. The summation over p may be interpreted as a
correction to the trapezoidal approximation. Equation (5.142b) may be seen as
a generalization of Eq. (5.22); it is the form used in Exercise 5.9.5 for summing
positive powers of integers and in Section 10.3 for the derivation of Stirling's
formula.

The Euler-Maclaurin formula is often useful in summing series by converting
them to integrals.¹⁷
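
For a polynomial f(x) the remainder integral in Eq. (5.142b) vanishes once 2q exceeds the degree, so the formula sums powers of integers exactly. The sketch below applies it to $f(x) = x^k$ with $k \ge 1$ and compares with a brute-force sum. The function names are hypothetical, and the Bernoulli numbers come from the standard recursion $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ rather than from any routine in the text.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(n_max):
    """B_0..B_n_max with B_1 = -1/2, from sum_{j=0}^{m} C(m+1, j) B_j = 0."""
    b = [Fraction(1)]
    for m in range(1, n_max + 1):
        b.append(-sum(comb(m + 1, j) * b[j] for j in range(m)) / (m + 1))
    return b


def power_sum(k, n):
    """Sum of m**k for m = 1..n (k >= 1), rearranged from Eq. (5.142b)."""
    b = bernoulli_numbers(k + 1)
    # Integral of x^k from 0 to n, plus the trapezoidal end correction f(n)/2.
    total = Fraction(n ** (k + 1), k + 1) + Fraction(n**k, 2)
    p = 1
    while 2 * p - 1 < k:
        # f^(2p-1)(n) for f = x^k; the same derivative at x = 0 vanishes here.
        deriv_at_n = Fraction(factorial(k), factorial(k - 2 * p + 1)) * n ** (k - 2 * p + 1)
        total += b[2 * p] / factorial(2 * p) * deriv_at_n
        p += 1
    return total


for k in range(1, 5):
    print(k, power_sum(k, 10), sum(m**k for m in range(1, 11)))
```

The loop stops at the last odd derivative of order below k because the derivative of order k is a constant, so its values at n and 0 cancel.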


17 For a number of examples, see Boas, R. P., and Stutz, C. (1971). Estimating sums with integrals.
Am. J. Phys. 39, 745.

Improvement of Convergence


This section has so far been concerned with establishing convergence as an
abstract mathematical property. In practice, the rate of convergence may be
of considerable importance. Here, we present one method of improving the
rate of convergence of a convergent series.

The basic principle of this method, due to Kummer, is to form a linear
combination of our slowly converging series and one or more series whose
sum is known. For the known series, the collection

$$\alpha_1 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$$

$$\alpha_2 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{4}$$

$$\alpha_3 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)(n+3)} = \frac{1}{18}$$

$$\vdots$$

$$\alpha_p = \sum_{n=1}^{\infty} \frac{1}{n(n+1)\cdots(n+p)} = \frac{1}{p \cdot p!}$$

is particularly useful.¹⁸ The series are combined term by term and the
coefficients in the linear combination chosen to cancel the most slowly converging
terms.


EXAMPLE 5.9.1 


Riemann Zeta Function, $\zeta(3)$ Let the series to be summed be $\sum_{n=1}^{\infty} n^{-3}$.
In Section 5.2, this is identified as the Riemann zeta function, $\zeta(3)$. We form a
linear combination (using $\alpha_i$ defined above)

$$\sum_{n=1}^{\infty} n^{-3} + a_2\alpha_2 = \sum_{n=1}^{\infty} n^{-3} + \frac{a_2}{4}.$$

$\alpha_1$ is not included, since it converges more slowly than $\zeta(3)$. Combining terms,
we obtain on the left-hand side

$$\sum_{n=1}^{\infty} \left\{\frac{1}{n^3} + \frac{a_2}{n(n+1)(n+2)}\right\} = \sum_{n=1}^{\infty} \frac{n^2(1 + a_2) + 3n + 2}{n^3(n+1)(n+2)}.$$

If we choose $a_2 = -1$, the preceding equations yield

$$\zeta(3) = \sum_{n=1}^{\infty} n^{-3} = \frac{1}{4} + \sum_{n=1}^{\infty} \frac{3n + 2}{n^3(n+1)(n+2)}. \tag{5.143}$$

The resulting series may not be beautiful, but it does converge as $n^{-4}$, faster
than $n^{-3}$. A more convenient form is shown in Exercise 5.9.21. There, the
symmetry leads to convergence as $n^{-5}$.

The method can be extended, including $a_3\alpha_3$ to get convergence as $n^{-5}$,
$a_4\alpha_4$ to get convergence as $n^{-6}$, and so on. Eventually, you have to reach a
compromise between how much algebra you do and how much arithmetic the
computer does. As computers get faster, the balance is steadily shifting to less
algebra for you and more arithmetic for them. ■
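
The gain from Eq. (5.143) is easy to see numerically. The following sketch compares partial sums of the direct series $\sum n^{-3}$ with partial sums of the accelerated form; the reference value of $\zeta(3)$ hard-coded below and the function names are assumptions made for this comparison.

```python
ZETA3 = 1.2020569031595942  # reference value of zeta(3), quoted for comparison only


def direct(n_terms):
    """Partial sum of the original series, sum of n^-3."""
    return sum(1.0 / n**3 for n in range(1, n_terms + 1))


def accelerated(n_terms):
    """Partial sum of the Kummer-transformed series, Eq. (5.143)."""
    tail = sum((3 * n + 2) / (n**3 * (n + 1) * (n + 2)) for n in range(1, n_terms + 1))
    return 0.25 + tail


for n_terms in (10, 100, 1000):
    err_direct = abs(direct(n_terms) - ZETA3)
    err_accel = abs(accelerated(n_terms) - ZETA3)
    print(n_terms, err_direct, err_accel)
```

The direct error falls roughly as $1/(2N^2)$ while the accelerated error falls roughly as $1/N^3$, consistent with the terms decaying as $n^{-3}$ and $n^{-4}$, respectively.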


18 These series sums may be verified by expanding the forms by partial fractions, writing out the
initial terms, and inspecting the pattern of cancellation of positive and negative terms.

Improvement of Convergence by Rational Approximations

The series

$$\ln(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^n}{n}, \quad -1 < x \le 1 \tag{5.144}$$

converges very slowly as x approaches +1. The rate of convergence may be
improved substantially by multiplying both sides of Eq. (5.144) by a polynomial
and adjusting the polynomial coefficients to cancel the more slowly converging
portions of the series. Consider the simplest possibility: Multiply ln(1 + x) by
$1 + a_1 x$:

$$(1 + a_1 x)\ln(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^n}{n} + a_1\sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^{n+1}}{n}.$$

Combining the two series on the right term by term, we obtain

$$(1 + a_1 x)\ln(1 + x) = x + \sum_{n=2}^{\infty} (-1)^{n-1}\left(\frac{1}{n} - \frac{a_1}{n-1}\right)x^n \tag{5.145}$$

$$= x + \sum_{n=2}^{\infty} (-1)^{n-1}\frac{n(1 - a_1) - 1}{n(n-1)}\,x^n. \tag{5.146}$$

Clearly, if we take $a_1 = 1$, the n in the numerator disappears and our combined
series converges as $n^{-2}$.

Continuing this process, we find that the general term of $(1 + 2x + x^2)\ln(1 + x)$
vanishes as $n^{-3}$ and that of $(1 + 3x + 3x^2 + x^3)\ln(1 + x)$ vanishes as
$n^{-4}$. In effect we are shifting from a simple series expansion of Eq. (5.144) to a
rational fraction representation in which the function ln(1 + x) is represented
by the ratio of a series and a polynomial:

$$\ln(1 + x) = \frac{x + \sum_{n=2}^{\infty} (-1)^n x^n/[n(n-1)]}{1 + x}. \tag{5.147}$$

Such rational approximations may be both compact and accurate. Computer
subroutines make extensive use of such methods.
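
The effect of Eq. (5.147) can be seen by comparing truncated sums at x = 1, the slowest case of Eq. (5.144). The sketch below is a small experiment; the function names and the choice of truncation points are assumptions.

```python
import math


def log1p_direct(x, n_terms):
    """Partial sum of Eq. (5.144)."""
    return sum((-1) ** (n - 1) * x**n / n for n in range(1, n_terms + 1))


def log1p_rational(x, n_terms):
    """Eq. (5.147) with the series in the numerator truncated at n = n_terms."""
    series = sum((-1) ** n * x**n / (n * (n - 1)) for n in range(2, n_terms + 1))
    return (x + series) / (1.0 + x)


x = 1.0
exact = math.log(1.0 + x)
for n_terms in (10, 100, 1000):
    print(n_terms,
          abs(log1p_direct(x, n_terms) - exact),
          abs(log1p_rational(x, n_terms) - exact))
```

At x = 1 the error of the direct sum shrinks roughly as 1/N, while the rational form shrinks roughly as $1/N^2$, in line with the $n^{-2}$ decay of its terms.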

If we are required to sum a convergent series $\sum_{n=1}^{\infty} a_n$ whose terms are
rational functions of n, the convergence may be improved dramatically by
introducing the Riemann zeta function.


EXAMPLE 5.9.2

Improvement of Convergence The problem is to evaluate the series
$\sum_{n=1}^{\infty} 1/(1 + n^2)$. Expanding $(1 + n^2)^{-1} = n^{-2}(1 + n^{-2})^{-1}$ by direct division,
we have

$$(1 + n^2)^{-1} = \frac{1}{n^2} \left( 1 - \frac{1}{n^2} + \frac{1}{n^4} - \frac{n^{-6}}{1 + n^{-2}} \right) = \frac{1}{n^2} - \frac{1}{n^4} + \frac{1}{n^6} - \frac{1}{n^8 + n^6}.$$

Therefore,

$$\sum_{n=1}^{\infty} \frac{1}{1 + n^2} = \zeta(2) - \zeta(4) + \zeta(6) - \sum_{n=1}^{\infty} \frac{1}{n^8 + n^6}.$$

The $\zeta$ values are tabulated and the remainder series converges as $n^{-8}$. Clearly,
the process can be continued as desired. You make a choice between how
much algebra you will do and how much arithmetic the computer will do. ■
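
The trade-off in Example 5.9.2 can be illustrated with a short Python sketch. It is not part of the original text: the function names are hypothetical, and the closed form $(\pi\coth\pi - 1)/2$ used as the reference value is a standard result brought in only for checking. The zeta values are taken from their known closed forms $\pi^2/6$, $\pi^4/90$, and $\pi^6/945$.

```python
import math

def direct_sum(n_terms):
    """Plain partial sum of 1/(1 + n^2)."""
    return sum(1.0 / (1.0 + n * n) for n in range(1, n_terms + 1))

def accelerated_sum(n_terms):
    """zeta(2) - zeta(4) + zeta(6) minus a partial sum of the fast remainder series."""
    zeta2 = math.pi ** 2 / 6
    zeta4 = math.pi ** 4 / 90
    zeta6 = math.pi ** 6 / 945
    remainder = sum(1.0 / (n ** 8 + n ** 6) for n in range(1, n_terms + 1))
    return zeta2 - zeta4 + zeta6 - remainder

# Reference value (assumption: standard closed form, not derived in the text).
exact = (math.pi / math.tanh(math.pi) - 1.0) / 2.0

for n_terms in (5, 10):
    print(n_terms, abs(direct_sum(n_terms) - exact), abs(accelerated_sum(n_terms) - exact))
```

Ten terms of the remainder series already give many more correct digits than ten terms of the original series, whose tail falls off only as $1/n$.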


EXERCISES 
5.9.1 Show that

$$\tan x = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}\, 2^{2n} (2^{2n} - 1) B_{2n}}{(2n)!}\, x^{2n-1}, \qquad -\frac{\pi}{2} < x < \frac{\pi}{2}.$$

Hint. $\tan x = \cot x - 2 \cot 2x$.

5.9.2 Show that the first Bernoulli polynomials are

$$B_0(s) = 1, \qquad B_1(s) = s - \tfrac{1}{2}, \qquad B_2(s) = s^2 - s + \tfrac{1}{6}.$$

Note that $B_n(0) = B_n$, the Bernoulli number.

5.9.3 Show that $B_n'(s) = n B_{n-1}(s)$, $n = 1, 2, 3, \ldots$
Hint. Differentiate Eq. (5.132).

5.9.4 Show that

$$B_n(1) = (-1)^n B_n(0).$$

Hint. Go back to the generating function, Eq. (5.132) or Exercise 5.9.2.

5.9.5 The Euler-Maclaurin integration formula may be used for the evaluation of finite series:

$$\sum_{m=1}^{n} f(m) = \int_1^n f(x)\, dx + \frac{1}{2} f(1) + \frac{1}{2} f(n) + \frac{B_2}{2!} \left[ f'(n) - f'(1) \right] + \cdots.$$

Show that

(a) $\sum_{m=1}^{n} m = \frac{1}{2} n(n + 1)$.

(b) $\sum_{m=1}^{n} m^2 = \frac{1}{6} n(n + 1)(2n + 1)$.

(c) $\sum_{m=1}^{n} m^3 = \frac{1}{4} n^2 (n + 1)^2$.

(d) $\sum_{m=1}^{n} m^4 = \frac{1}{30} n(n + 1)(2n + 1)(3n^2 + 3n - 1)$.


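5.9.6 From

$$B_{2n} = (-1)^{n-1}\, \frac{2\,(2n)!}{(2\pi)^{2n}}\, \zeta(2n),$$

show that

(a) $\zeta(2) = \dfrac{\pi^2}{6}$

(b) $\zeta(4) = \dfrac{\pi^4}{90}$

(c) $\zeta(6) = \dfrac{\pi^6}{945}$

(d) $\zeta(8) = \dfrac{\pi^8}{9{,}450}$

(e) $\zeta(10) = \dfrac{\pi^{10}}{93{,}555}$.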


5.9.7 Planck's black-body radiation law involves the integral

$$\int_0^{\infty} \frac{x^3\, dx}{e^x - 1}.$$

Show that this equals $6\zeta(4)$. From Exercise 5.9.6,

$$\zeta(4) = \frac{\pi^4}{90}.$$

Hint. Make use of the gamma function (Chapter 10).


5.9.8 Prove that

$$\int_0^{\infty} \frac{x^n e^x\, dx}{(e^x - 1)^2} = n!\, \zeta(n).$$

Assuming $n$ to be real, show that each side of the equation diverges if
$n = 1$. Hence, the preceding equation carries the condition $n > 1$. Integrals such as this appear in the quantum theory of transport effects:
thermal and electrical conductivity.


5.9.9 The Bloch-Grüneisen approximation for the resistance in a monovalent metal is

$$\rho = C\, \frac{T^5}{\Theta^6} \int_0^{\Theta/T} \frac{x^5\, dx}{(e^x - 1)(1 - e^{-x})},$$

where $\Theta$ is the Debye temperature characteristic of the metal.

(a) For $T \to \infty$, show that

$$\rho \approx \frac{C}{4} \cdot \frac{T}{\Theta^2}.$$

(b) For $T \to 0$, show that

$$\rho \approx 5!\, \zeta(5)\, C\, \frac{T^5}{\Theta^6}.$$


5.9.10 Show that

(a) $\displaystyle \int_0^1 \frac{\ln(1 + x)}{x}\, dx = \frac{1}{2} \zeta(2)$,

(b) $\displaystyle \lim_{a \to 1} \int_0^a \frac{\ln(1 - x)}{x}\, dx = -\zeta(2)$.

From Exercise 5.9.6, $\zeta(2) = \pi^2/6$. Note that the integrand in part (b)
diverges for $a = 1$, but that the integrated series is convergent.


5.9.11 The integral

$$\int_0^1 \left[ \ln(1 - x) \right]^2 \frac{dx}{x}$$

appears in the fourth-order correction to the magnetic moment of the
electron. Show that it equals $2\zeta(3)$.

Hint. Let $1 - x = e^{-t}$.


5.9.12 Show that

$$\int_0^{\infty} \frac{(\ln z)^2}{1 + z^2}\, dz = 4 \left( 1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots \right).$$

By contour integration (Exercise 7.2.16), this may be shown equal to
$\pi^3/8$.


5.9.13 For "small" values of $x$,

$$\ln(x!) = -\gamma x + \sum_{n=2}^{\infty} (-1)^n\, \frac{\zeta(n)}{n}\, x^n,$$

where $\gamma$ is the Euler-Mascheroni constant and $\zeta(n)$ the Riemann zeta
function. For what values of $x$ does this series converge?

ANS. $-1 < x \le 1$.

Note that if $x = 1$, we obtain

$$\gamma = \sum_{n=2}^{\infty} (-1)^n\, \frac{\zeta(n)}{n},$$

a series for the Euler-Mascheroni constant. The convergence of this
series is exceedingly slow. For actual computation of $\gamma$, other, indirect
approaches are far superior (see Exercises 5.10.11 and 10.5.16).


5.9.14 Show that the series expansion of $\ln(x!)$ (Exercise 5.9.13) may be written as

(a) $\displaystyle \ln(x!) = \frac{1}{2} \ln\left( \frac{\pi x}{\sin \pi x} \right) - \gamma x - \sum_{n=1}^{\infty} \frac{\zeta(2n + 1)}{2n + 1}\, x^{2n+1}$,

(b) $\displaystyle \ln(x!) = \frac{1}{2} \ln\left( \frac{\pi x}{\sin \pi x} \right) - \frac{1}{2} \ln\left( \frac{1 + x}{1 - x} \right) + (1 - \gamma) x - \sum_{n=1}^{\infty} \left[ \zeta(2n + 1) - 1 \right] \frac{x^{2n+1}}{2n + 1}$.

Determine the range of convergence of each of these expressions.

Table 5.3 Riemann Zeta Function

 s     ζ(s)
 2     1.6449340668
 3     1.2020569032
 4     1.0823232337
 5     1.0369277551
 6     1.0173430620
 7     1.0083492774
 8     1.0040773562
 9     1.0020083928
 10    1.0009945751

5.9.15 Show that Catalan's constant, $\beta(2)$, may be written as

$$\beta(2) = 2 \sum_{k=1}^{\infty} (4k - 3)^{-2} - \frac{\pi^2}{8}.$$

Hint. $\pi^2 = 6\zeta(2)$.

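5.9.16 Derive the following expansions of the Debye functions for $n \ge 1$:

$$\int_0^x \frac{t^n\, dt}{e^t - 1} = x^n \left[ \frac{1}{n} - \frac{x}{2(n + 1)} + \sum_{k=1}^{\infty} \frac{B_{2k}\, x^{2k}}{(2k + n)(2k)!} \right], \qquad |x| < 2\pi,$$

$$\int_x^{\infty} \frac{t^n\, dt}{e^t - 1} = \sum_{k=1}^{\infty} e^{-kx} \left[ \frac{x^n}{k} + \frac{n x^{n-1}}{k^2} + \frac{n(n - 1) x^{n-2}}{k^3} + \cdots + \frac{n!}{k^{n+1}} \right]$$

for $x > 0$. The complete integral $(0, \infty)$ equals $n!\, \zeta(n + 1)$.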

5.9.17 (a) Show that the equation $\ln 2 = \sum_{s=1}^{\infty} (-1)^{s+1} s^{-1}$ (Exercise 5.4.1) may
be rewritten as

$$\ln 2 = \sum_{s=2}^{n} 2^{-s} \zeta(s) + \sum_{p=1}^{\infty} (2p)^{-n-1} \left[ 1 - \frac{1}{2p} \right]^{-1}.$$

Hint. Take the terms in pairs.

(b) Calculate $\ln 2$ to six significant figures.

5.9.18 (a) Show that the equation $\pi/4 = \sum_{n=1}^{\infty} (-1)^{n+1} (2n - 1)^{-1}$ (Exercise
5.7.3) may be rewritten as

$$\frac{\pi}{4} = 1 - 2 \sum_{s=1}^{n} 4^{-2s} \zeta(2s) - 2 \sum_{p=1}^{\infty} (4p)^{-2n-2} \left[ 1 - \frac{1}{(4p)^2} \right]^{-1}.$$

(b) Calculate $\pi/4$ to six significant figures.

5.9.19 Write a function subprogram ZETA(N) that will calculate the Riemann
zeta function for integer argument. Tabulate $\zeta(s)$ for $s = 2, 3, 4, \ldots, 20$.
Check your values against Table 5.3 and AMS-55 (Chapter 23).

Hint. If you supply the function subprogram with the known values of
$\zeta(2)$, $\zeta(3)$, and $\zeta(4)$, you avoid the more slowly converging series.



5.9.20 Calculate the logarithm (base 10) of $|B_{2n}|$, $n = 10, 20, \ldots, 100$.
Hint. Program $\zeta(n)$ as a function subprogram (Exercise 5.9.19).

Check values. $\log |B_{100}| = 78.45$,
$\log |B_{200}| = 215.56$.

5.9.21 Determine the values of the coefficients $a_1$, $a_2$, and $a_3$ that will make
$(1 + a_1 x + a_2 x^2 + a_3 x^3) \ln(1 + x)$ converge as $n^{-4}$. Find the resulting
series.



5.10 Asymptotic Series

Asymptotic series frequently occur in physics. In numerical computations they
are employed for the accurate computation of a variety of functions. We consider here a type of integral that leads to asymptotic series: an integral in
which the variable $x$ appears as the lower limit.

Asymptotic series often occur as solutions of differential equations. An example of this appears in Section 12.3, as a solution of Bessel's equation. One of
the most useful and powerful methods of generating asymptotic expansions,
the method of steepest descents, will be developed in Section 7.3. Applications
include the derivation of Stirling's formula for the (complete) factorial function (Section 10.3) and the asymptotic forms of the various Bessel functions
(Section 12.3). One of the earliest and most important approximations of
quantum mechanics, the WKB expansion, is an asymptotic series.


Error Function 


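The nature of an asymptotic series is perhaps best illustrated by a specific
example. Let us start with the familiar error function erf(x) and its complement
erfc(x),

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt, \qquad \operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2}\, dt, \tag{5.148}$$

which have many applications in statistics, probability theory, and error analyses in physics and engineering and are normalized so that $\operatorname{erf}(\infty) = 1$. Suppose
we have to evaluate the error function for large values of $x$. Then we express
$\operatorname{erf}(x) = 1 - \operatorname{erfc}(x)$, because erfc(x) is better suited for an asymptotic expansion, which we now generate by repeated integration by parts of

$$\int_x^{\infty} e^{-t^2}\, dt = -\frac{1}{2} \int_x^{\infty} (-2t)\, e^{-t^2}\, \frac{dt}{t} = \frac{e^{-x^2}}{2x} - \frac{1}{2} \int_x^{\infty} e^{-t^2}\, \frac{dt}{t^2}.$$

Once we recognize the emerging pattern

$$\int_x^{\infty} e^{-t^2}\, dt = \frac{e^{-x^2}}{2x} \sum_{\nu=0}^{n} (-1)^{\nu}\, \frac{(2\nu - 1)!!}{2^{\nu} x^{2\nu}} - (-1)^n\, \frac{(2n + 1)!!}{2^{n+1}} \int_x^{\infty} e^{-t^2}\, \frac{dt}{t^{2n+2}} \tag{5.149}$$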


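from the first two or three terms, we can prove it by mathematical induction
by one more integration by parts of the last integral term

$$-\frac{1}{2} \int_x^{\infty} (-2t)\, e^{-t^2}\, \frac{dt}{t^{2n+3}} = -\frac{e^{-t^2}}{2\, t^{2n+3}} \Bigg|_x^{\infty} - \frac{2n + 3}{2} \int_x^{\infty} e^{-t^2}\, \frac{dt}{t^{2n+4}}.$$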


Putting this result back into Eq. (5.149), we generate the integral term for
$n \to n + 1$ and the general term of the sum in Eq. (5.149) for $n \to n + 1$, thus
proving the expansion for all natural integers. Substituting Eq. (5.149) into
Eq. (5.148), we obtain the error function

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt = 1 - \frac{e^{-x^2}}{\sqrt{\pi}\, x} \left( 1 - \frac{1}{2x^2} + \frac{1 \cdot 3}{2^2 x^4} - \frac{1 \cdot 3 \cdot 5}{2^3 x^6} + \cdots + (-1)^n\, \frac{(2n - 1)!!}{2^n x^{2n}} \right) + (-1)^n\, \frac{(2n + 1)!!}{2^n \sqrt{\pi}} \int_x^{\infty} e^{-t^2}\, \frac{dt}{t^{2n+2}} \tag{5.150}$$


in terms of a finite (partial) series and an integral as a remainder term.

This is a remarkable series. Checking the convergence by the d'Alembert
ratio test,

$$\lim_{n \to \infty} \frac{|u_{n+1}|}{|u_n|} = \lim_{n \to \infty} \frac{2n + 1}{2x^2} = \infty \tag{5.151}$$


for all values of $x$. Therefore, our series as an infinite series diverges everywhere. Before discarding it as worthless, let us see how well a given partial
sum approximates the error function. When we substitute $t = x + v/x$ in the
remainder, the integral becomes

$$\frac{1}{x} \int_0^{\infty} e^{-(x + v/x)^2}\, \frac{dv}{(x + v/x)^{2n+2}} = \frac{e^{-x^2}}{x^{2n+3}} \int_0^{\infty} e^{-2v - v^2/x^2}\, \frac{dv}{(1 + v/x^2)^{2n+2}}.$$


For fixed $n$ and large $x$, the integral is less than $\int_0^{\infty} e^{-2v}\, dv = 1/2$. Thus, the
error function is equal to the partial sum up to an error that is less than

$$\frac{1}{\sqrt{\pi}}\, \frac{(2n + 1)!!}{2^{n+1}}\, \frac{e^{-x^2}}{x^{2n+3}},$$

which goes to zero as $x \to \infty$ for all natural integers $n$. This means that if
we take $x$ large enough, our partial sum in Eq. (5.150) is an arbitrarily good
approximation to the error function. Although formally divergent when extended to $n \to \infty$, any partial sum is perfectly good for computations. For this
reason, it is sometimes called a semiconvergent series. Note that such asymptotic series are always finite sums and never infinite series. Because
the remainder integral in Eq. (5.150) alternates in sign, the successive partial
sums give alternately upper and lower bounds for the error function.

SUMMARY


Asymptotic series are approximations by one or more terms of a finite series
of certain integrals (or integral representations of special functions) at special
values of their argument. They are important tools for accurate approximations.
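
The semiconvergent behavior of the error-function series can be seen numerically. The sketch below is not part of the original text; the function names are hypothetical, and Python's `math.erfc` is used as the reference value. It evaluates the partial sum of Eq. (5.150) for erfc(x) = 1 - erf(x) and compares the truncation error with the bound $(2n+1)!!\, e^{-x^2} / (\sqrt{\pi}\, 2^{n+1} x^{2n+3})$ that follows from the remainder integral being less than 1/2.

```python
import math

def erfc_partial_sum(x, n):
    """Asymptotic partial sum for erfc(x): terms nu = 0..n of Eq. (5.150)."""
    total, term = 0.0, 1.0                      # the nu = 0 term equals 1
    for nu in range(n + 1):
        total += term
        term *= -(2 * nu + 1) / (2.0 * x * x)   # ratio of successive terms
    return math.exp(-x * x) / (math.sqrt(math.pi) * x) * total

def error_bound(x, n):
    """Bound (2n+1)!! e^{-x^2} / (sqrt(pi) 2^(n+1) x^(2n+3)) on the truncation error."""
    double_factorial = math.prod(range(1, 2 * n + 2, 2))   # (2n+1)!!
    return (double_factorial * math.exp(-x * x)
            / (math.sqrt(math.pi) * 2 ** (n + 1) * x ** (2 * n + 3)))

x = 3.0   # illustrative argument
for n in (0, 2, 4, 8, 30):
    err = abs(erfc_partial_sum(x, n) - math.erfc(x))
    print(f"n={n:2d}  error={err:.3e}  bound={error_bound(x, n):.3e}")
```

For x = 3 the error shrinks until n is near $x^2$ and then grows again, which is the practical meaning of "divergent but useful": the best accuracy is obtained by truncating near the smallest term.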


EXERCISES 


5.10.1 Stirling's formula for the logarithm of the factorial function is

$$\ln(x!) = \frac{1}{2} \ln 2\pi + \left( x + \frac{1}{2} \right) \ln x - x + O(x^{-1}).$$

Show that Stirling's formula is an asymptotic expression.


5.10.2 Integrating by parts, develop asymptotic expansions of the Fresnel
integrals:

(a) $\displaystyle C(x) = \int_0^x \cos\frac{\pi u^2}{2}\, du$

(b) $\displaystyle s(x) = \int_0^x \sin\frac{\pi u^2}{2}\, du$.

These integrals appear in the analysis of a knife-edge diffraction
pattern.


5.10.3 Derive the asymptotic expansions of Ci(x) and si(x) by repeated integration by parts.

Hint. $\operatorname{Ci}(x) + i \operatorname{si}(x) = -\int_x^{\infty} \frac{e^{it}}{t}\, dt$.

5.10.4 Evaluate the Gauss error function at x = 1, 2, 3 using the asymptotic 
series [Eq. (5.150)]. Choose n so as to obtain an accuracy of 1, 4, 8 
significant decimal places. 


5.10.5 For $x > 1$

$$\frac{1}{1 + x} = \sum_{n=0}^{\infty} (-1)^n\, \frac{1}{x^{n+1}}.$$

Test this series to determine if it is an asymptotic series.


5.10.6 Develop an asymptotic series for

$$\int_0^{\infty} e^{-xv} (1 + v^2)^{-2}\, dv.$$

Take $x$ to be real and positive.

ANS. $\displaystyle \frac{1}{x} - \frac{2!}{x^3} + \frac{4!}{x^5} - \cdots + \frac{(-1)^n (2n)!}{x^{2n+1}}$.
Chapter 6


Functions of a Complex Variable I

Analytic Properties, Mapping


The imaginary numbers are a wonderful 
flight of God’s spirit; they are almost an 
amphibian between being and not being. 

—Gottfried Wilhelm von Leibniz, 1702 


The theory of functions of one complex variable contains some of the most 
powerful and widely useful tools in all of mathematical analysis. To indicate 
why complex variables are important, we mention briefly several areas of 
application. 

First, for many pairs of functions $u$ and $v$, both $u$ and $v$ satisfy Laplace's
equation in two real dimensions

$$\nabla^2 u = \frac{\partial^2 u(x, y)}{\partial x^2} + \frac{\partial^2 u(x, y)}{\partial y^2} = 0.$$

For example, either $u$ or $v$ may be used to describe a two-dimensional
electrostatic potential. The other function then gives a family of curves orthogonal to the equipotential curves of the first function and may be used to
describe the electric field E. A similar situation holds for the hydrodynamics of
an ideal fluid in irrotational motion. The function $u$ might describe the velocity
potential, whereas the function $v$ would then be the stream function.

In some cases in which the functions u and v are unknown, mapping or 
transforming complex variables permits us to create a (curved) coordinate 
system tailored to the particular problem. 

Second, complex numbers are constructed (in Section 6.1) from pairs of
real numbers so that the real number field is embedded naturally in the complex number field. In mathematical terms, the complex number field is an
extension of the real number field, and the former is complete in the sense that
any polynomial of order $n$ has $n$ (in general) complex zeros. This fact was
first proved by Gauss and is called the fundamental theorem of algebra (see
Sections 6.4 and 7.2). As a consequence, real functions, infinite real series, and
integrals usually can be generalized naturally to complex numbers simply by
replacing a real variable $x$, for example, by complex $z$.

In Chapter 8, we shall see that the second-order differential equations of
interest in physics may be solved by power series. The same power series
may be used by replacing $x$ by the complex variable $z$. The dependence of
the solution $f(z)$ at a given $z_0$ on the behavior of $f(z)$ elsewhere gives us
greater insight into the behavior of our solution and a powerful tool (analytic
continuation) for extending the region in which the solution is valid.

Third, the change of a parameter $k$ from real to imaginary transforms
the Helmholtz equation into the diffusion equation. The same change transforms the Helmholtz equation solutions (Bessel and spherical Bessel functions) into the diffusion equation solutions (modified Bessel and modified
spherical Bessel functions).

Fourth, integrals in the complex plane have a wide variety of useful 
applications: 

• Evaluating definite integrals (in Section 7.2) 

• Inverting power series 

• Infinite product representations of analytic functions (in Section 7.2) 

• Obtaining solutions of differential equations for large values of some variable (asymptotic solutions in Section 7.3)

• Investigating the stability of potentially oscillatory systems 

• Inverting integral transforms (in Chapter 15) 

Finally, many physical quantities that were originally real become complex
as a simple physical theory is made more general. The real index of refraction
of light becomes a complex quantity when absorption is included. The real
energy associated with an energy level becomes complex, $E = m \pm i\Gamma$, when
the finite lifetime of the level is considered. Electric circuits with resistance
$R$, capacitance $C$, and inductance $L$ typically lead to a complex impedance
$Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$.

We start with complex arithmetic in Section 6.1 and then introduce complex
functions and their derivatives in Section 6.2. This leads to the fundamental
Cauchy integral formula in Sections 6.3 and 6.4; analytic continuation, singularities, and Taylor and Laurent expansions of functions in Section 6.5; and
conformal mapping, branch point singularities, and multivalent functions in
Sections 6.6 and 6.7.


6.1 Complex Algebra 


As we know from practice with solving real quadratic equations for their
real zeros, they often fail to yield a solution. A case in point is the following
example.

EXAMPLE 6.1.1

Positive Quadratic Form For all real $x$

$$y(x) = x^2 + x + 1 = \left( x + \tfrac{1}{2} \right)^2 + \tfrac{3}{4} > 0$$

is positive definite; that is, in the real number field $y(x) = 0$ has no solutions.
Of course, if we use the symbol $i = \sqrt{-1}$, we can formally write the solutions
of $y(x) = 0$ as $\tfrac{1}{2}(-1 \pm i\sqrt{3})$ and check that

$$\left[ \tfrac{1}{2}(-1 \pm i\sqrt{3}) \right]^2 + \tfrac{1}{2}(-1 \pm i\sqrt{3}) + 1 = \tfrac{1}{4}(1 - 3 \mp 2i\sqrt{3}) + \tfrac{1}{2}(-1 \pm i\sqrt{3}) + 1 = 0.$$

Although we can do arithmetic with $i$ subject to the rule $i^2 = -1$, this symbol
does not tell us what imaginary numbers really are. ■

In order to make complex zeros visible we have to enlarge the real numbers
on a line to complex numbers in a plane. We define a complex number, like
a point with two coordinates in the Euclidean plane, as an ordered pair of
two real numbers $(a, b)$, as shown in Fig. 6.1. Similarly, a complex variable
is an ordered pair of two real variables,

$$z = (x, y). \tag{6.1}$$

The ordering is significant; $x$ is called the real part of $z$ and $y$ the imaginary
part of $z$. In general, $(a, b)$ is not equal to $(b, a)$ and $(x, y)$ is not equal to
$(y, x)$. As usual, we continue writing a real number $(x, 0)$ simply as $x$, and
call $i = (0, 1)$ the imaginary unit. The $x$-axis is the real axis and the $y$-axis the
imaginary axis of the complex number plane. Note that in electrical engineering
the convention is $j = \sqrt{-1}$, and $i$ is reserved for a current there. The complex
numbers $\tfrac{1}{2}(-1 \pm i\sqrt{3})$ from Example 6.1.1 are the points $\left(-\tfrac{1}{2}, \pm\tfrac{\sqrt{3}}{2}\right)$.


Figure 6.1: Complex Plane (Argand Diagram), showing the point $(x, y)$ with abscissa $x$ and ordinate $y$.

A graphical representation is a powerful means to see a complex number or
variable. By plotting $x$ (the real part of $z$) as the abscissa and $y$ (the imaginary
part of $z$) as the ordinate, we have the complex plane or Argand plane shown
in Fig. 6.1. If we assign specific values to $x$ and $y$, then $z$ corresponds to a
point $(x, y)$ in the plane. In terms of the ordering mentioned previously, it is
obvious that the point $(x, y)$ does not coincide with the point $(y, x)$ except for
the special case of $x = y$.

Complex numbers are points in the plane, and now we want to add, subtract, multiply, and divide them, just like real numbers. All our complex variable
analyses can now be developed in terms of ordered pairs¹ of numbers $(a, b)$,
variables $(x, y)$, and functions $(u(x, y), v(x, y))$.

We now define addition of complex numbers in terms of their Cartesian
components as

$$z_1 + z_2 = (x_1, y_1) + (x_2, y_2) = (x_1 + x_2,\, y_1 + y_2) = z_2 + z_1, \tag{6.2}$$

that is, two-dimensional vector addition. In Chapter 1, the points in the $xy$-plane are identified with the two-dimensional displacement vector $\mathbf{r} = \hat{\mathbf{x}} x + \hat{\mathbf{y}} y$.
As a result, two-dimensional vector analogs can be developed for much of our
complex analysis. Exercise 6.1.2 is one simple example; Cauchy's theorem
(Section 6.3) is another. Also, $-z + z = (-x, -y) + (x, y) = 0$ so that the
negative of a complex number is uniquely specified. Subtraction of complex
numbers then proceeds as addition: $z_1 - z_2 = (x_1 - x_2,\, y_1 - y_2)$.
Multiplication of complex numbers is defined as

$$z_1 z_2 = (x_1, y_1) \cdot (x_2, y_2) = (x_1 x_2 - y_1 y_2,\, x_1 y_2 + x_2 y_1). \tag{6.3}$$

Using Eq. (6.3) we verify that $i^2 = (0, 1) \cdot (0, 1) = (-1, 0) = -1$ so that we can
also identify $i = \sqrt{-1}$ as usual, and further rewrite Eq. (6.1) as

$$z = (x, y) = (x, 0) + (0, y) = x + (0, 1) \cdot (y, 0) = x + iy. \tag{6.4}$$

Clearly, the $i$ is not necessary here but it is truly convenient and traditional. It serves to keep pairs in order, somewhat like the unit vectors of
vector analysis in Chapter 1.
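
The ordered-pair rules of Eqs. (6.2) and (6.3) translate directly into code. The following Python sketch is illustrative only; the helper names `add` and `mul` are hypothetical, and Python's built-in `complex` type is used solely as a cross-check.

```python
def add(z1, z2):
    """Eq. (6.2): componentwise addition of ordered pairs."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)

def mul(z1, z2):
    """Eq. (6.3): (x1 x2 - y1 y2, x1 y2 + x2 y1)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

i = (0.0, 1.0)                 # the imaginary unit as an ordered pair
print(mul(i, i))               # (-1.0, 0.0), that is, i^2 = -1

a, b = (2.0, 1.0), (3.0, -2.0)
print(add(a, b), mul(a, b))    # (5.0, -1.0) (8.0, -1.0)
print(complex(*mul(a, b)) == complex(*a) * complex(*b))   # cross-check: True
```

No symbol $i$ appears in the arithmetic itself; it enters only as a name for the pair (0, 1).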

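With complex numbers at our disposal, we can determine the complex
zeros of $z^2 + z + 1 = 0$ in Example 6.1.1 as $z = -\tfrac{1}{2} \pm \tfrac{i}{2}\sqrt{3}$ so that

$$z^2 + z + 1 = \left( z + \tfrac{1}{2} - \tfrac{i}{2}\sqrt{3} \right) \left( z + \tfrac{1}{2} + \tfrac{i}{2}\sqrt{3} \right)$$

factorizes completely.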


Complex Conjugation 


The operation of replacing $i$ by $-i$ is called "taking the complex conjugate."
The complex conjugate of $z$ is denoted by $z^*$,² where

$$z^* = x - iy. \tag{6.5}$$

¹ This is precisely how a computer does complex arithmetic.

² The complex conjugate is often denoted by $\bar{z}$ in the mathematical literature.

Figure 6.2: Complex Conjugate Points, showing $z = (x, y)$ and $z^* = (x, -y)$.


The complex variable $z$ and its complex conjugate $z^*$ are mirror images of each
other reflected in the $x$-axis; that is, inversion of the $y$-axis (compare Fig. 6.2).
The product $zz^*$ leads to

$$zz^* = (x + iy)(x - iy) = x^2 + y^2 = r^2. \tag{6.6}$$

Hence,

$$(zz^*)^{1/2} = |z|$$

is defined as the magnitude or modulus of $z$.

Division of complex numbers is most easily accomplished by replacing
the denominator by a positive number as follows:

$$\frac{z_1}{z_2} = \frac{x_1 + iy_1}{x_2 + iy_2} = \frac{(x_1 + iy_1)(x_2 - iy_2)}{(x_2 + iy_2)(x_2 - iy_2)} = \frac{(x_1 x_2 + y_1 y_2,\; x_2 y_1 - x_1 y_2)}{x_2^2 + y_2^2}, \tag{6.7}$$

which displays its real and imaginary parts as ratios of real numbers with
the same positive denominator. Here, $|z_2|^2 = x_2^2 + y_2^2$ is the absolute value
(squared) of $z_2$, and $z_2^* = x_2 - iy_2$ is called the complex conjugate of $z_2$.
We write $|z_2|^2 = z_2^* z_2$, which is the squared length of the associated Cartesian
vector in the complex plane.

Furthermore, from Fig. 6.1 we may write in plane polar coordinates

$$x = r \cos\theta, \qquad y = r \sin\theta \tag{6.8}$$

and

$$z = r(\cos\theta + i \sin\theta). \tag{6.9}$$

In this representation $r$ is the modulus or magnitude of
$z$ $\left(r = |z| = (x^2 + y^2)^{1/2}\right)$
and the angle $\theta$ $\left(= \tan^{-1}(y/x)\right)$ is labeled the argument or phase of $z$. Using
a result that is suggested (but not rigorously proved)³ by Section 5.6, we have
the very useful polar representation

$$z = re^{i\theta}. \tag{6.10}$$

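In order to prove this identity, we use $i^3 = -i$, $i^4 = 1$, etc. in the Taylor
expansion of the exponential and trigonometric functions and separate even
and odd powers in

$$e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!} = \sum_{\nu=0}^{\infty} \frac{(i\theta)^{2\nu}}{(2\nu)!} + \sum_{\nu=0}^{\infty} \frac{(i\theta)^{2\nu+1}}{(2\nu + 1)!} = \sum_{\nu=0}^{\infty} (-1)^{\nu} \frac{\theta^{2\nu}}{(2\nu)!} + i \sum_{\nu=0}^{\infty} (-1)^{\nu} \frac{\theta^{2\nu+1}}{(2\nu + 1)!} = \cos\theta + i \sin\theta. \tag{6.11}$$

For the special values $\theta = \pi/2$ and $\theta = \pi$, we obtain

$$e^{i\pi/2} = \cos(\pi/2) + i \sin(\pi/2) = i, \qquad e^{i\pi} = \cos(\pi) = -1,$$

intriguing connections between $e$, $i$, and $\pi$. Moreover, the exponential function $e^{i\theta}$ is periodic with period $2\pi$, just like $\sin\theta$ and $\cos\theta$. As an immediate
application we can derive the trigonometric addition rules from

$$\cos(\theta_1 + \theta_2) + i \sin(\theta_1 + \theta_2) = e^{i(\theta_1 + \theta_2)} = e^{i\theta_1} e^{i\theta_2} = [\cos\theta_1 + i \sin\theta_1][\cos\theta_2 + i \sin\theta_2]$$

$$= \cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2 + i(\sin\theta_1 \cos\theta_2 + \sin\theta_2 \cos\theta_1).$$

Let us now convert a ratio of complex numbers to polar form explicitly.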

EXAMPLE 6.1.2

Conversion to Polar Form We start by converting the denominator of a
ratio to a real number:

$$\frac{2 + i}{3 - 2i} = \frac{(2 + i)(3 + 2i)}{3^2 + 2^2} = \frac{6 - 2 + i(3 + 4)}{13} = \frac{4 + 7i}{13} = \sqrt{\frac{5}{13}}\, e^{i\theta_0},$$

where $\frac{4^2 + 7^2}{13^2} = \frac{65}{13^2} = \frac{5}{13}$ and $\tan\theta_0 = \frac{7}{4}$. Because $\arctan(\theta)$ has two branches in
the range from zero to $2\pi$, we pick the solution $\theta_0 = 60.255°$, $0 < \theta_0 < \pi/2$,
because the second solution $\theta_0 + \pi$ gives $e^{i(\theta_0 + \pi)} = -e^{i\theta_0}$ (i.e., the wrong sign).

Alternately, we can convert $2 + i = \sqrt{5}\, e^{i\alpha}$ and $3 - 2i = \sqrt{13}\, e^{i\beta}$ to polar
form with $\tan\alpha = \frac{1}{2}$, $\tan\beta = -\frac{2}{3}$ and then divide them to get

$$\frac{2 + i}{3 - 2i} = \sqrt{\frac{5}{13}}\, e^{i(\alpha - \beta)}.$$

³ Strictly speaking, Chapter 5 was limited to real variables. However, we can define $e^z$ as $\sum_{n=0}^{\infty} z^n/n!$
for complex $z$. The development of power-series expansions for complex functions is taken up in
Section 6.5 (Laurent expansion).

The choice of polar representation [Eq. (6.10)] or Cartesian representation
[Eqs. (6.1) and (6.4)] is a matter of convenience. Addition and subtraction
of complex variables are easier in the Cartesian representation [Eq. (6.2)].
Multiplication, division, powers, and roots are easier to handle in polar form
[Eqs. (6.8)-(6.10)].

Let us examine the geometric meaning of multiplying a function by a complex constant.

EXAMPLE 6.1.3

Multiplication by a Complex Number When we multiply the complex
variable $z$ by $i = e^{i\pi/2}$, for example, it is rotated counterclockwise by 90° to
$iz = ix - y = (-y, x)$. When we multiply $z = re^{i\theta}$ by $e^{i\alpha}$, we get $re^{i(\theta + \alpha)}$, which
is $z$ rotated by the angle $\alpha$.

Similarly, curves defined by $f(z) = \text{const.}$ are rotated when we multiply a
function by a complex constant. When we set

$$f(z) = (x + iy)^2 = (x^2 - y^2) + 2ixy = c = c_1 + ic_2 = \text{const.},$$

we define two hyperbolas

$$x^2 - y^2 = c_1, \qquad 2xy = c_2.$$

Upon multiplying $f(z) = c$ by a complex number $Ae^{i\alpha}$, we obtain

$$Ae^{i\alpha} f(z) = A(\cos\alpha + i \sin\alpha)(x^2 - y^2 + 2ixy) = A[i(2xy \cos\alpha + (x^2 - y^2) \sin\alpha) - 2xy \sin\alpha + (x^2 - y^2) \cos\alpha].$$

The hyperbolas are scaled by the modulus $A$ and rotated by the angle $\alpha$. ■

Analytically or graphically, using the vector analogy, we may show that the
modulus of the sum of two complex numbers is no greater than the sum of the
moduli and no less than the difference (Exercise 6.1.2):

$$|z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|. \tag{6.12}$$

Because of the vector analogy, these are called the triangle inequalities.

Using the polar form [Eq. (6.8)] we find that the magnitude of a product is
the product of the magnitudes,

$$|z_1 \cdot z_2| = |z_1| \cdot |z_2|. \tag{6.13}$$

Also,

$$\arg(z_1 \cdot z_2) = \arg z_1 + \arg z_2. \tag{6.14}$$

From our complex variable $z$ complex functions $f(z)$ or $w(z)$ may be constructed. These complex functions may then be resolved into real and imaginary parts

$$w(z) = u(x, y) + iv(x, y), \tag{6.15}$$

in which the separate functions $u(x, y)$ and $v(x, y)$ are pure real. For example,
if $f(z) = z^2$, we have

$$f(z) = (x + iy)^2 = x^2 - y^2 + 2ixy.$$

The real part of a function $f(z)$ will be labeled $\Re f(z)$, whereas the imaginary part will be labeled $\Im f(z)$. In Eq. (6.15),

$$\Re w(z) = u(x, y), \qquad \Im w(z) = v(x, y). \tag{6.16}$$

The relationship between the independent variable $z$ and the dependent variable $w$ is perhaps best pictured as a mapping operation. A given $z = x + iy$
means a given point in the $z$-plane. The complex value of $w(z)$ is then a point
in the $w$-plane. Points in the $z$-plane map into points in the $w$-plane, and curves
in the $z$-plane map into curves in the $w$-plane, as indicated in Fig. 6.3.

Figure 6.3: The Function $w(z) = u(x, y) + iv(x, y)$ Maps Points in the $xy$-Plane into Points in the $uv$-Plane


Functions of a Complex Variable 


All the elementary (real) functions of a real variable may be extended into
the complex plane, replacing the real variable $x$ by the complex variable $z$.
This is an example of the analytic continuation mentioned in Section 6.5. The
extremely important relations, Eqs. (6.4), (6.8), and (6.9), are illustrations.
Moving into the complex plane opens up new opportunities for analysis.


EXAMPLE 6.1.4

De Moivre's Formula If Eq. (6.11) is raised to the $n$th power, we have

$$e^{in\theta} = (\cos\theta + i \sin\theta)^n. \tag{6.17}$$

Using Eq. (6.11) now with argument $n\theta$, we obtain

$$\cos n\theta + i \sin n\theta = (\cos\theta + i \sin\theta)^n. \tag{6.18}$$

This is De Moivre's formula.

Now if the right-hand side of Eq. (6.18) is expanded by the binomial theorem, we obtain $\cos n\theta$ as a series of powers of $\cos\theta$ and $\sin\theta$ (Exercise 6.1.5).
Numerous other examples of relations among the exponential, hyperbolic, and
trigonometric functions in the complex plane appear in the exercises. ■


Occasionally, there are complications. Taking the $n$th root of a complex
number $z = re^{i\theta}$ gives $z^{1/n} = r^{1/n} e^{i\theta/n}$. This root is not the only solution,
though, because $z = re^{i(\theta + 2m\pi)}$ for any integer $m$ yields $n - 1$ additional roots
$z^{1/n} = r^{1/n} e^{i\theta/n + 2im\pi/n}$ for $m = 1, 2, \ldots, n - 1$. Therefore, taking the $n$th root
is a multivalued function or operation with $n$ values, for a given complex
number $z$. Let us look at a numerical example.


EXAMPLE 6.1.5

Square Root When we take the square root of a complex number of argument $\theta$ we get $\theta/2$. Starting from $-1$, which is $r = 1$ at $\theta = 180°$, we end
up with $r = 1$ at $\theta = 90°$, which is $i$, or we get $\theta = -90°$, which is $-i$ upon
taking the root of $-1 = e^{-i\pi}$. Here is a more complicated ratio of complex
numbers:

$$\sqrt{\frac{3 - i}{4 + 2i}} = \sqrt{\frac{(3 - i)(4 - 2i)}{4^2 + 2^2}} = \sqrt{\frac{12 - 2 - i(4 + 6)}{20}} = \sqrt{\frac{1}{2}(1 - i)} = \sqrt{\frac{1}{\sqrt{2}}\, e^{-i(\pi/4 - 2n\pi)}} = \frac{1}{2^{1/4}}\, e^{-i\pi/8 + in\pi} = \pm \frac{1}{2^{1/4}}\, e^{-i\pi/8}$$

for $n = 0, 1$.
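
The $n$ values of the $n$th root can be generated directly from the polar form. The sketch below is illustrative and not part of the original text; `nth_roots` is a hypothetical helper built on the standard `cmath.polar` and `cmath.rect` functions.

```python
import cmath
import math

def nth_roots(z, n):
    """All n values r^(1/n) exp(i(theta + 2 m pi)/n), m = 0, ..., n-1."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2.0 * math.pi * m) / n)
            for m in range(n)]

z = (3 - 1j) / (4 + 2j)            # the ratio from Example 6.1.5
roots = nth_roots(z, 2)
expected = cmath.rect(2 ** -0.25, -math.pi / 8)   # 2^(-1/4) exp(-i pi/8)
print(roots, expected)
```

The two square roots differ only by the overall sign, in agreement with the factor $e^{in\pi}$ for $n = 0, 1$.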


Another example is the logarithm of a complex variable $z$ that may be
expanded using the polar representation

$$\ln z = \ln re^{i\theta} = \ln r + i\theta. \tag{6.19}$$

Again, this is not complete due to the multiple branches of the inverse tangent
function. To the phase angle, $\theta$, we may add any integral multiple of $2\pi$ without
changing $z$, because $e^{i\theta}$ has period $2\pi$. Hence, Eq. (6.19)
should read

$$\ln z = \ln re^{i(\theta + 2n\pi)} = \ln r + i(\theta + 2n\pi). \tag{6.20}$$

The parameter $n$ may be any integer. This means that $\ln z$ is a multivalued
function having an infinite number of values for a single pair of real values $r$
and $\theta$. To avoid ambiguity, we usually agree to set $n = 0$ and limit the phase
to an interval of length $2\pi$ such as $(-\pi, \pi)$.⁴ The line in the $z$-plane that is not
crossed, the negative real axis in this case, is labeled a cut line. The value
of $\ln z$ with $n = 0$ is called the principal value of $\ln z$. Further discussion of
these functions, including the logarithm, appears in Section 6.6.
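
Eq. (6.20) can be explored with a few lines of Python. This is a minimal sketch, not part of the original text; `log_branch` is a hypothetical helper, and Python's `cmath.log` is used as the principal value, which takes the phase in $(-\pi, \pi]$ with the cut along the negative real axis.

```python
import cmath
import math

def log_branch(z, n=0):
    """ln z on branch n: ln r + i(theta + 2 n pi), with theta the principal phase."""
    r, theta = cmath.polar(z)
    return complex(math.log(r), theta + 2.0 * math.pi * n)

z = -1.0 + 1.0j                    # illustrative point
for n in (-1, 0, 1):
    w = log_branch(z, n)
    print(n, w, cmath.exp(w))      # every branch exponentiates back to z

print(cmath.log(z))                # principal value, equal to the n = 0 branch
```

All branches share the same real part $\ln r$ and differ only by multiples of $2\pi i$ in the imaginary part.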


⁴ There is no standard choice of phase: The appropriate phase depends on each problem.

Figure 6.4: Electric RLC Circuit with Alternating Current


EXAMPLE 6.1.6

Electric Circuits An electric circuit with a current $I$ flowing through a
resistor and driven by a voltage $V$ is governed by Ohm's law, $V = IR$, where
$R$ is the resistance. If the resistance is replaced by an inductance $L$, then the
voltage and current are related by $V = L \frac{dI}{dt}$. If the inductance is replaced by
the capacitance $C$, then the voltage depends on the charge $Q$ of the capacitor:
$V = \frac{Q}{C}$. Taking the time derivative yields $C \frac{dV}{dt} = \frac{dQ}{dt} = I$. Therefore, a circuit
with a resistor, a coil, and a capacitor in series (see Fig. 6.4) obeys the ordinary
differential equation

$$L \frac{dI}{dt} + \frac{Q}{C} + IR = V = V_0 \cos\omega t \tag{6.21}$$

if it is driven by an alternating voltage with frequency $\omega$. In electrical engineering it is a convention and tradition to use the complex voltage $V = V_0 e^{i\omega t}$ and
a current $I = I_0 e^{i\omega t}$ of the same form, which is the steady-state solution of
Eq. (6.21). This complex form will make the phase difference between current
and voltage manifest. At the end, the physically observed values are taken to be
the real parts (i.e., $V_0 \cos\omega t = V_0 \Re e^{i\omega t}$, etc.). If we substitute the exponential
time dependence, use $dI/dt = i\omega I$, and integrate $I$ once to get $Q = I/i\omega$ in
Eq. (6.21), we find the following complex form of Ohm's law:

$$i\omega L I + \frac{I}{i\omega C} + RI = V = ZI.$$

We define $Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$ as the impedance, a complex number, obtaining
$V = ZI$, as shown.
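
The complex form of Ohm's law lends itself to a direct numerical check. In the Python sketch below, which is not part of the original text, the component values, the driving frequency, and the function name `series_rlc_impedance` are all hypothetical choices made only for illustration.

```python
import cmath

def series_rlc_impedance(R, L, C, omega):
    """Z = R + i(omega L - 1/(omega C)) for a series RLC circuit."""
    return complex(R, omega * L - 1.0 / (omega * C))

# Hypothetical component values, chosen only for illustration.
R, L, C = 10.0, 1.0e-3, 1.0e-6     # ohm, henry, farad
omega, V0 = 5.0e4, 1.0             # rad/s, volt

Z = series_rlc_impedance(R, L, C, omega)
I0 = V0 / Z                        # complex current amplitude from V = Z I
# Left-hand side of the complex Ohm's law, term by term.
lhs = 1j * omega * L * I0 + I0 / (1j * omega * C) + R * I0
print(Z, abs(I0), cmath.phase(I0))
print(abs(lhs - V0))               # should vanish up to rounding
```

For these values the inductive term dominates, so the phase of the current is negative: the current lags the driving voltage, which is the phase difference the complex form is meant to make manifest.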

More complicated electric circuits can now be constructed using the impedance alone (that is, without solving Eq. (6.21) anymore) according to the
following combination rules:

• The resistance $R$ of two resistors in series is $R = R_1 + R_2$.

• The inductance $L$ of two inductors in series is $L = L_1 + L_2$.

• The resistance $R$ of two parallel resistors obeys $\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}$.

• The inductance $L$ of two parallel inductors obeys $\frac{1}{L} = \frac{1}{L_1} + \frac{1}{L_2}$.

• The capacitance of two capacitors in series obeys $\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2}$.

• The capacitance of two parallel capacitors obeys $C = C_1 + C_2$.

In complex form these rules can be stated in a more compact form as follows:

• Two impedances in series combine as $Z = Z_1 + Z_2$;

• Two parallel impedances combine as $\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2}$. ■

SUMMARY

Complex numbers extend the real number axis to the complex number plane 
so that any polynomial can be completely factored. Complex numbers add and 
subtract like two-dimensional vectors in Cartesian coordinates: 


Zi + z 2 = (x\ + iy\) + (x 2 + iyi) = Xy + X 2 + f(?/i + y 2 \ 

They are best multiplied or divided in polar coordinates of the complex plane: 

(aci + m/i )(>2 + iy 2 ) = r l e lBl r 2 e il>2 = nr 2 e m+e ^, rj = x* + tfj, tan Bj = yj/xj. 

The complex exponential function is given by e z = e r '(cos y+i sin y). For 
z = x+iO = x, e z = e x . The trigonometric functions become 

cos z = 1 (e lz + e~ lz ) — cos x cosh y — i sin :r sinh y, 

sin z = ^ (e lz — e~ lz ) — sin x cosh y + i cos .t' sinh y. 

The hyperbolic functions become 

cosh z= -(e z + e~ z ) = cos iz, sinh z= - (e z — e~ z ) = — i sin iz. 

Li Lu 

The natural logarithm generalizes to in 2 ’ = In \z\ + i(6 + 2 jvri), n— 0, ±1, ... 
and general powers are defined as z p — e plnz . 
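The identities collected in this summary can be spot-checked numerically. The
sketch below uses Python's standard `cmath` module; the test point `z`, the
helper names `log_branch` and `power`, and the choice of $-4$ as an example are
assumptions made only for illustration.

```python
import cmath
import math

z = complex(0.7, -1.3)          # arbitrary test point (assumed)
x, y = z.real, z.imag

# Real/imaginary decomposition of cos z and sin z.
cos_from_parts = complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))
sin_from_parts = complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))

def log_branch(w, n=0):
    """ln w = ln|w| + i(theta + 2 pi n); n labels the branch."""
    return complex(math.log(abs(w)), cmath.phase(w) + 2.0 * math.pi * n)

def power(w, p, n=0):
    """General power w**p = exp(p ln w) evaluated on branch n."""
    return cmath.exp(p * log_branch(w, n))

# The two branches n = 0, 1 give the two square roots of -4.
roots = [power(-4 + 0j, 0.5, n) for n in (0, 1)]

print(cmath.cos(z), cos_from_parts)
print(roots)
```

Different branch labels `n` change $\ln z$ by multiples of $2\pi i$ but leave
$e^{\ln z}$ unchanged, whereas a fractional power picks up a branch-dependent phase.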


EXERCISES 

6.1.1 (a) Find the reciprocal of x + iy, working entirely in the Cartesian 

representation. 

(b) Repeat part (a), working in polar form but expressing the final result 
in Cartesian form. 

6.1.2 Prove algebraically that

$$ |z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|. $$

Interpret this result in terms of vectors. Prove that

$$ |z - 1| < \left|\sqrt{z^2 - 1}\right| < |z + 1|, \quad \text{for } \Re(z) > 0. $$

6.1.3 We may define a complex conjugation operator K such that $Kz = z^*$.
Show that K is not a linear operator.

6.1.4 Show that complex numbers have square roots and that the square 
roots are contained in the complex plane. What are the square roots 
of i? 
6.1.5 Show that

(a) $\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta \sin^2\theta + \binom{n}{4}\cos^{n-4}\theta \sin^4\theta - \cdots$,

(b) $\sin n\theta = \binom{n}{1}\cos^{n-1}\theta \sin\theta - \binom{n}{3}\cos^{n-3}\theta \sin^3\theta + \cdots$.

Note. The quantities $\binom{n}{m}$ are the binomial coefficients (Chapter 5):
$\binom{n}{m} = n!/[(n-m)!\,m!]$.


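6.1.6 Show that

(a) $\displaystyle \sum_{n=0}^{N-1} \cos nx = \frac{\sin(Nx/2)}{\sin(x/2)} \cos\left[(N-1)\frac{x}{2}\right]$,

(b) $\displaystyle \sum_{n=0}^{N-1} \sin nx = \frac{\sin(Nx/2)}{\sin(x/2)} \sin\left[(N-1)\frac{x}{2}\right]$.

Hint. Parts (a) and (b) may be combined to form a geometric series
(compare Section 5.1).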

6.1.7 For $-1 < p < 1$, show that

(a) $\displaystyle \sum_{n=0}^{\infty} p^n \cos nx = \frac{1 - p\cos x}{1 - 2p\cos x + p^2}$,

(b) $\displaystyle \sum_{n=0}^{\infty} p^n \sin nx = \frac{p \sin x}{1 - 2p\cos x + p^2}$.

These series occur in the theory of the Fabry-Perot interferometer.

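6.1.8 Assume that the trigonometric functions and the hyperbolic functions
are defined for complex argument by the appropriate power series

$$ \sin z = \sum_{n=1,\,\mathrm{odd}}^{\infty} (-1)^{(n-1)/2} \frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s \frac{z^{2s+1}}{(2s+1)!}, $$

$$ \cos z = \sum_{n=0,\,\mathrm{even}}^{\infty} (-1)^{n/2} \frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s \frac{z^{2s}}{(2s)!}, $$

$$ \sinh z = \sum_{n=1,\,\mathrm{odd}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s+1}}{(2s+1)!}, $$

$$ \cosh z = \sum_{n=0,\,\mathrm{even}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s}}{(2s)!}. $$

(a) Show that

$$ i \sin z = \sinh iz, \quad \sin iz = i \sinh z, \quad \cos z = \cosh iz, \quad \cos iz = \cosh z. $$

(b) Verify that familiar functional relations such as

$$ \cosh z = \frac{e^z + e^{-z}}{2}, \quad \sin(z_1 + z_2) = \sin z_1 \cos z_2 + \sin z_2 \cos z_1 $$

still hold in the complex plane.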
6.1.9 Using the identities

$$ \cos z = \frac{e^{iz} + e^{-iz}}{2}, \quad \sin z = \frac{e^{iz} - e^{-iz}}{2i}, $$

established from comparison of power series, show that

(a) $\sin(x + iy) = \sin x \cosh y + i \cos x \sinh y$,
$\cos(x + iy) = \cos x \cosh y - i \sin x \sinh y$,

(b) $|\sin z|^2 = \sin^2 x + \sinh^2 y$, $|\cos z|^2 = \cos^2 x + \sinh^2 y$.

This demonstrates that we may have $|\sin z|, |\cos z| > 1$ in the complex
plane.

6.1.10 From the identities in Exercises 6.1.8 and 6.1.9, show that

(a) $\sinh(x + iy) = \sinh x \cos y + i \cosh x \sin y$,
$\cosh(x + iy) = \cosh x \cos y + i \sinh x \sin y$,

(b) $|\sinh z|^2 = \sinh^2 x + \sin^2 y$, $|\cosh z|^2 = \sinh^2 x + \cos^2 y$.


6.1.11 Prove that

(a) $|\sin z| \ge |\sin x|$, (b) $|\cos z| \ge |\cos x|$.

6.1.12 Show that the exponential function $e^z$ is periodic with a pure imaginary
period of $2\pi i$.


6.1.13 Show that

(a) $\displaystyle \tanh(z/2) = \frac{\sinh x + i \sin y}{\cosh x + \cos y}$, (b) $\displaystyle \coth(z/2) = \frac{\sinh x - i \sin y}{\cosh x - \cos y}$.


6.1.14 Find all the zeros of

(a) $\sin z$, (b) $\cos z$, (c) $\sinh z$, (d) $\cosh z$.


6.1.15 Show that

(a) $\sin^{-1} z = -i \ln\left(iz \pm \sqrt{1 - z^2}\right)$, (d) $\sinh^{-1} z = \ln\left(z + \sqrt{z^2 + 1}\right)$,

(b) $\cos^{-1} z = -i \ln\left(z \pm \sqrt{z^2 - 1}\right)$, (e) $\cosh^{-1} z = \ln\left(z + \sqrt{z^2 - 1}\right)$,

(c) $\displaystyle \tan^{-1} z = \frac{i}{2} \ln\left(\frac{i + z}{i - z}\right)$, (f) $\displaystyle \tanh^{-1} z = \frac{1}{2} \ln\left(\frac{1 + z}{1 - z}\right)$.

Hint. 1. Express the trigonometric and hyperbolic functions in terms
of exponentials. 2. Solve for the exponential and then for the exponent.
Note that $\sin^{-1} z = \arcsin z \ne (\sin z)^{-1}$, etc.


6.1.16 A plane wave of light of angular frequency $\omega$ is represented by

$$ e^{i\omega(t - nx/c)}. $$

In a certain substance the simple real index of refraction n is replaced
by the complex quantity $n - ik$. What is the effect of k on the wave? What
does k correspond to physically? The generalization of a quantity from
real to complex form occurs frequently in physics. Examples range
from the complex Young's modulus of viscoelastic materials to the
complex (optical) potential of the "cloudy crystal ball" model of the
atomic nucleus. See the chapter on the optical model in M. A. Preston,
Structure of the Nucleus. Addison-Wesley, Reading, MA (1993).


6.1.17 A damped simple harmonic oscillator is driven by the complex external
force $F e^{i\omega t}$. Show that the steady-state amplitude is given by

$$ A = \frac{F}{m(\omega_0^2 - \omega^2) + i\omega b}. $$

Explain the resonance condition and relate $m$, $\omega_0$, $b$ to the oscillator
parameters.

Hint. Find a complex solution $z(t) = A e^{i\omega t}$ of the ordinary differential
equation.


6.1.18 We see that for the angular momentum components defined in Exercise
2.5.10,

$$ L_x - iL_y \ne (L_x + iL_y)^*. $$

Explain why this happens.


6.1.19 Show that the phase of f(z) = u + iv is equal to the imaginary part of 
the logarithm of f(z). Exercise 10.2.13 depends on this result. 

6.1.20 (a) Show that $e^{\ln z}$ always equals $z$.

(b) Show that $\ln e^z$ does not always equal $z$.


6.1.21 Verify the consistency of the combination rules of impedances with
those of resistances, inductances, and capacitances by considering
circuits with resistors only, etc. Derive the combination rules from
Kirchhoff's laws. Describe the origin of Kirchhoff's laws.


6.1.22 Show that negative numbers have logarithms in the complex plane. In
particular, find $\ln(-1)$.


ANS. $\ln(-1) = i\pi$.


6.2 Cauchy-Riemann Conditions 


Having established complex functions of a complex variable, we now proceed
to differentiate them. The derivative of $f(z) = u(x, y) + iv(x, y)$, like that of a
real function, is defined by

$$ \lim_{\delta z \to 0} \frac{f(z + \delta z) - f(z)}{z + \delta z - z} = \lim_{\delta z \to 0} \frac{\delta f(z)}{\delta z} = \frac{df}{dz} = f'(z), \tag{6.22} $$

provided that the limit is independent of the particular approach to the point z.
For real variables we require that the right-hand limit ($x \to x_0$ from above) and
the left-hand limit ($x \to x_0$ from below) be equal for the derivative $df(x)/dx$
to exist at $x = x_0$. Now, with $z$ (or $z_0$) some point in a plane, our requirement
that the limit be independent of the direction of approach is very restrictive.
Consider increments $\delta x$ and $\delta y$ of the variables x and y, respectively. Then

$$ \delta z = \delta x + i\,\delta y. \tag{6.23} $$
Figure 6.5 Alternate Approaches to $z_0$

Also,

$$ \delta f = \delta u + i\,\delta v \tag{6.24} $$

so that

$$ \frac{\delta f}{\delta z} = \frac{\delta u + i\,\delta v}{\delta x + i\,\delta y}. \tag{6.25} $$
Let us take the limit indicated by Eq. (6.22) by two different approaches as
shown in Fig. 6.5. First, with $\delta y = 0$, we let $\delta x \to 0$. Equation (6.25) yields


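$$ \lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta x \to 0} \left( \frac{\delta u}{\delta x} + i \frac{\delta v}{\delta x} \right) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x}, \tag{6.26} $$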


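assuming the partial derivatives exist. For a second approach, we set $\delta x = 0$
and then let $\delta y \to 0$. This leads to

$$ \lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta y \to 0} \left( -i \frac{\delta u}{\delta y} + \frac{\delta v}{\delta y} \right) = -i \frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}. \tag{6.27} $$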


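If we are to have a derivative $df/dz$, Eqs. (6.26) and (6.27) must be identical.
Equating real parts to real parts and imaginary parts to imaginary parts (like
components of vectors), we obtain

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{6.28} $$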


These are the famous Cauchy-Riemann conditions. They were discovered
by Cauchy and used extensively by Riemann in his theory of analytic functions.
These Cauchy-Riemann conditions are necessary for the existence of
a derivative of $f(z)$; that is, if $df/dz$ exists, the Cauchy-Riemann conditions
must hold. They may be interpreted geometrically as follows. Let us write
them as a product of ratios of partial derivatives

$$ \frac{u_x}{u_y} \cdot \frac{v_x}{v_y} = -1, \tag{6.29} $$

with the abbreviations

$$ u_x = \frac{\partial u}{\partial x}, \quad u_y = \frac{\partial u}{\partial y}, \quad v_x = \frac{\partial v}{\partial x}, \quad v_y = \frac{\partial v}{\partial y}. $$
Figure 6.6 Orthogonal Tangents to $u(x, y) = $ const., $v(x, y) = $ const. Lines

Now recall the geometric meaning of $-u_x/u_y$ as the slope of the tangent [see
Eq. (1.54)] of each curve $u(x, y) = $ const. and similarly for $v(x, y) = $ const.
(Fig. 6.6). Thus, Eq. (6.29) means that the $u = $ const. and $v = $ const. curves are
mutually orthogonal at each intersection because $\sin\beta = \sin(\alpha + 90°) = \cos\alpha$
and $\cos\beta = -\sin\alpha$ imply $\tan\beta \cdot \tan\alpha = -1$ by taking the ratio. Alternatively,

$$ u_x\,dx + u_y\,dy = 0 = v_y\,dx - v_x\,dy $$

states that if $(dx, dy)$ is tangent to the u-curve, then the orthogonal $(-dy, dx)$
is tangent to the v-curve at the intersection point $z = (x, y)$. Equivalently,
$u_x v_x + u_y v_y = 0$ implies that the gradient vectors $(u_x, u_y)$ and $(v_x, v_y)$ are
perpendicular. Conversely, if the Cauchy-Riemann conditions are satisfied
and the partial derivatives of $u(x, y)$ and $v(x, y)$ are continuous, the derivative
$df/dz$ exists. This may be shown by writing


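$$ \delta f = \left( \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} \right) \delta x + \left( \frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y} \right) \delta y. \tag{6.30} $$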


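The justification for this expression depends on the continuity of the partial
derivatives of u and v. Dividing by $\delta z$, we have

$$ \frac{\delta f}{\delta z} = \frac{(\partial u/\partial x + i\,\partial v/\partial x)\,\delta x + (\partial u/\partial y + i\,\partial v/\partial y)\,\delta y}{\delta x + i\,\delta y} = \frac{(\partial u/\partial x + i\,\partial v/\partial x) + (\partial u/\partial y + i\,\partial v/\partial y)\,\delta y/\delta x}{1 + i\,(\delta y/\delta x)}. \tag{6.31} $$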

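If $\delta f/\delta z$ is to have a unique value, the dependence on $\delta y/\delta x$ must be
eliminated. Applying the Cauchy-Riemann conditions to the y derivatives, we obtain

$$ \frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y} = -\frac{\partial v}{\partial x} + i \frac{\partial u}{\partial x}. \tag{6.32} $$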


Substituting Eq. (6.32) into Eq. (6.30), we may rewrite the $\delta y$ and $\delta x$
dependence as $\delta z = \delta x + i\,\delta y$ and obtain

$$ \frac{\delta f}{\delta z} = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x}, $$

which shows that $\lim \delta f/\delta z$ is independent of the direction of approach in the
complex plane as long as the partial derivatives are continuous.
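The direction independence of $\delta f/\delta z$ can be illustrated numerically by
forming the difference quotient for small steps $\delta z$ taken along several
directions. The sketch below compares $f(z) = z^2$, which satisfies the
Cauchy-Riemann conditions, with $f(z) = z^*$, which does not; the step size,
the number of directions, and the test point are arbitrary assumptions.

```python
import cmath
import math

def difference_quotient(f, z, angle, h=1e-6):
    """delta f / delta z for a step of length h in the direction `angle`."""
    dz = h * cmath.exp(1j * angle)
    return (f(z + dz) - f(z)) / dz

def directional_spread(f, z, n_dirs=8):
    """Largest deviation between quotients taken along n_dirs directions."""
    quotients = [difference_quotient(f, z, 2.0 * math.pi * k / n_dirs)
                 for k in range(n_dirs)]
    return max(abs(q - quotients[0]) for q in quotients)

z0 = complex(0.3, -0.8)                                   # assumed test point

spread_square = directional_spread(lambda z: z * z, z0)
spread_conj = directional_spread(lambda z: z.conjugate(), z0)

print("z^2 :", spread_square)   # tiny: the limit does not depend on direction
print("z*  :", spread_conj)     # of order 1: no unique derivative
```

For $z^*$ the quotient equals $e^{-2i\phi}$ for a step in the direction $\phi$, so it
takes every value on the unit circle as the direction of approach varies.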
It is worthwhile to note that the Cauchy-Riemann conditions guarantee
that the curves $u = c_1 = $ constant will be orthogonal to the curves $v = c_2 = $
constant (compare Section 2.1). This property is fundamental in application to
potential problems in a variety of areas of physics. If $u = c_1$ is a line of electric
force, then $v = c_2$ is an equipotential line (surface) and vice versa. Also, it is
easy to show from Eq. (6.28) that both u and v satisfy Laplace's equation. A
further implication for potential theory is developed in Exercise 6.2.1.
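A finite-difference sketch can make these statements concrete. It takes
$f(z) = e^z$ as an assumed example, with $u = e^x \cos y$ and $v = e^x \sin y$, and
estimates the Cauchy-Riemann residuals, the scalar product of the two
gradients, and the two Laplacians at one arbitrarily chosen point. The step
sizes are likewise assumptions.

```python
import math

def u(x, y):
    """Real part of exp(z)."""
    return math.exp(x) * math.cos(y)

def v(x, y):
    """Imaginary part of exp(z)."""
    return math.exp(x) * math.sin(y)

def grad(g, x, y, h=1e-5):
    """Central-difference estimate of (dg/dx, dg/dy)."""
    gx = (g(x + h, y) - g(x - h, y)) / (2.0 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2.0 * h)
    return gx, gy

def laplacian(g, x, y, h=1e-3):
    """Five-point estimate of d2g/dx2 + d2g/dy2."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4.0 * g(x, y)) / h**2

x0, y0 = 0.4, 1.1                       # assumed evaluation point
ux, uy = grad(u, x0, y0)
vx, vy = grad(v, x0, y0)

cr_residuals = (ux - vy, uy + vx)       # both vanish if Eq. (6.28) holds
gradient_dot = ux * vx + uy * vy        # vanishes for orthogonal level curves
lap_u, lap_v = laplacian(u, x0, y0), laplacian(v, x0, y0)

print(cr_residuals, gradient_dot, lap_u, lap_v)
```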

We have already generalized the elementary functions to the complex plane 
by replacing the real variable x by complex z. Let us now check that their 
derivatives are the familiar ones. 

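Derivatives of Elementary Functions We define the elementary functions
by their Taylor expansions (see Section 5.6, with $x \to z$, and Section 6.5)

$$ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}, \quad \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \quad \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}, \quad \ln(1 + z) = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{z^n}{n}. $$

We differentiate termwise [which is justified by absolute convergence for
$e^z$, $\cos z$, $\sin z$ for all z and for $\ln(1 + z)$ for $|z| < 1$] and see that

$$ \frac{d}{dz} z^n = \lim_{\delta z \to 0} \frac{(z + \delta z)^n - z^n}{\delta z} = \lim_{\delta z \to 0} \left[ z^n + n z^{n-1} \delta z + \cdots + (\delta z)^n - z^n \right] / \delta z = n z^{n-1}, $$

$$ \frac{d e^z}{dz} = e^z, \quad \frac{d \sin z}{dz} = \cos z, \quad \frac{d \cos z}{dz} = -\sin z, \quad \frac{d \ln(1 + z)}{dz} = \frac{1}{1 + z}; $$

that is, the real derivative results all generalize to the complex field, simply
replacing $x \to z$. ■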
Biographical Data

Riemann, Bernhard Georg Friedrich. Riemann, a German mathematician,
was born in 1826 in Hannover and died of tuberculosis in 1866 in
Selasca, Italy. Son of a Lutheran pastor, he changed from studying theology
to mathematics at the University of Göttingen where, in 1851, his
Ph.D. thesis was approved by Gauss. He contributed to many branches of
mathematics despite dying before the age of 40, the most famous being
the development of metric (curved) spaces from their intrinsic geometric
properties such as curvature. This topic was the subject of his Habilitation
thesis, or venia legendi, which Gauss attended and which deeply impressed
him. Half a century later Riemannian geometry would become the basis for
Einstein's General Relativity. Riemann's profound analysis of the complex
zeta function laid the foundations for the first proof of the prime number
theorem in 1896 by the mathematicians J. Hadamard and C. de la Vallée-Poussin
and other significant advances in the theory of analytic functions
of one complex variable. His hypothesis about the distribution of the nontrivial
zeros of the zeta function, with many consequences in analytic prime
number theory, remains the most famous unsolved problem in mathematics
today.


Analytic Functions 


Finally, if $f(z)$ is differentiable at $z = z_0$ and in some small region around $z_0$,
we say that $f(z)$ is analytic⁶ at $z = z_0$. If $f(z)$ is analytic everywhere in the
(finite) complex plane, we call it an entire function. Our theory of complex
variables is one of analytic functions of a complex variable, which indicates
the crucial importance of the Cauchy-Riemann conditions. The concept of
analyticity used in advanced theories of modern physics plays a crucial role in
dispersion theory (of elementary particles or light). If $f'(z)$ does not exist at
$z = z_0$, then $z_0$ is labeled a singular point and consideration of it is postponed
until Section 7.1.

To illustrate the Cauchy-Riemann conditions, consider two very simple 
examples. 


EXAMPLE 6.2.2 


Let $f(z) = z^2$. Then the real part $u(x, y) = x^2 - y^2$ and the imaginary part
$v(x, y) = 2xy$. Following Eq. (6.28),

$$ \frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -2y = -\frac{\partial v}{\partial x}. $$

We see that $f(z) = z^2$ satisfies the Cauchy-Riemann conditions throughout
the complex plane. Since the partial derivatives are clearly continuous, we
conclude that $f(z) = z^2$ is analytic. ■


6 Some writers use the terms holomorphic or regular. 
EXAMPLE 6.2.3


Let $f(z) = z^*$. Now $u = x$ and $v = -y$. Applying the Cauchy-Riemann conditions,
we obtain

$$ \frac{\partial u}{\partial x} = 1, \quad \text{whereas} \quad \frac{\partial v}{\partial y} = -1. $$

The Cauchy-Riemann conditions are not satisfied and $f(z) = z^*$ is not an
analytic function of z. It is interesting to note that $f(z) = z^*$ is continuous,
thus providing an example of a function that is everywhere continuous but
nowhere differentiable. ■


SUMMARY


The derivative of a real function of a real variable is essentially a local
characteristic in that it provides information about the function only in a local
neighborhood (for instance, as a truncated Taylor expansion). The existence
of a derivative of a function of a complex variable has much more far-reaching
implications. The real and imaginary parts of analytic functions must separately
satisfy Laplace's equation. This is Exercise 6.2.1. Furthermore, an analytic
function is guaranteed derivatives of all orders (Section 6.4). In this sense
the derivative not only governs the local behavior of the complex function but
also controls the distant behavior.


EXERCISES 


6.2.1 The functions $u(x, y)$ and $v(x, y)$ are the real and imaginary parts,
respectively, of an analytic function $w(z)$.

(a) Assuming that the required derivatives exist, show that

$$ \nabla^2 u = \nabla^2 v = 0. $$

Solutions of Laplace's equation, such as $u(x, y)$ and $v(x, y)$, are called
harmonic functions.

(b) Show that

$$ \frac{\partial u}{\partial x} \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \frac{\partial v}{\partial y} = 0 $$

and give a geometric interpretation.

Hint. The technique of Section 1.5 allows you to construct vectors normal
to the curve $u(x, y) = c_i$ and $v(x, y) = c_j$.

6.2.2 Show whether or not the function $f(z) = \Re(z) = x$ is analytic.

6.2.3 Having shown that the real part u(x, y) and the imaginary part v(x, y) 
of an analytic function w(z) each satisfy Laplace’s equation, show that 
u(x, y) and v(x, y) cannot both have either a maximum or a mini¬ 
mum in the interior of any region in which w(z) is analytic. (They can 
have saddle points; see Section 7.3.) 

6.2.4 Let $A = \partial^2 w/\partial x^2$, $B = \partial^2 w/\partial x\,\partial y$, $C = \partial^2 w/\partial y^2$. From the calculus of
functions of two variables, $w(x, y)$, we have a saddle point if

$$ B^2 - AC > 0. $$

With $f(z) = u(x, y) + iv(x, y)$, apply the Cauchy-Riemann conditions
and show that neither $u(x, y)$ nor $v(x, y)$ has a maximum or a
minimum in a finite region of the complex plane. (See also Section 7.3.)

6.2.5 Find the analytic function

$$ w(z) = u(x, y) + iv(x, y) $$

if (a) $u(x, y) = x^3 - 3xy^2$, (b) $v(x, y) = e^{-y} \sin x$.

6.2.6 If there is some common region in which $w_1 = u(x, y) + iv(x, y)$ and
$w_2 = w_1^* = u(x, y) - iv(x, y)$ are both analytic, prove that $u(x, y)$ and
$v(x, y)$ are constants.

6.2.7 The function $f(z) = u(x, y) + iv(x, y)$ is analytic. Show that $f^*(z^*)$ is
also analytic.

6.2.8 A proof of the Schwarz inequality (Section 9.4) involves minimizing an
expression

$$ f = \psi_{aa} + \lambda \psi_{ab} + \lambda^* \psi_{ab}^* + \lambda \lambda^* \psi_{bb} \ge 0. $$

The $\psi$ are integrals of products of functions; $\psi_{aa}$ and $\psi_{bb}$ are real, $\psi_{ab}$ is
complex, and $\lambda$ is a complex parameter.

(a) Differentiate the preceding expression with respect to $\lambda^*$, treating $\lambda$
as an independent parameter, independent of $\lambda^*$. Show that setting
the derivative $\partial f/\partial \lambda^*$ equal to zero yields

$$ \lambda = -\psi_{ab}^*/\psi_{bb}. $$

(b) Show that $\partial f/\partial \lambda = 0$ leads to the same result.

(c) Let $\lambda = x + iy$, $\lambda^* = x - iy$. Set the x and y derivatives equal to zero
and show that again

$$ \lambda = -\psi_{ab}^*/\psi_{bb}. $$

6.2.9 The function $f(z)$ is analytic. Show that the derivative of $f(z)$ with
respect to $z^*$ does not exist unless $f(z)$ is a constant.

Hint. Use the chain rule and take $x = (z + z^*)/2$, $y = (z - z^*)/2i$.

Note. This result emphasizes that our analytic function $f(z)$ is not just
a complex function of two real variables x and y. It is a function of the
complex variable $x + iy$.


6.3 Cauchy’s Integral Theorem 


Contour Integrals 


With differentiation under control, we turn to integration. The integral of a
complex variable over a contour in the complex plane may be defined in close
analogy to the (Riemann) integral of a real function integrated along the real
x-axis and line integrals of vectors in Chapter 1. The contour integral may be
defined by

$$ \int_{z_1}^{z_2} f(z)\,dz = \int_{x_1, y_1}^{x_2, y_2} [u(x, y) + iv(x, y)][dx + i\,dy] = \int_{x_1, y_1}^{x_2, y_2} [u(x, y)\,dx - v(x, y)\,dy] + i \int_{x_1, y_1}^{x_2, y_2} [v(x, y)\,dx + u(x, y)\,dy] \tag{6.33} $$

with the path joining $(x_1, y_1)$ and $(x_2, y_2)$ specified. If the path C is parameterized
as $x(s)$, $y(s)$, then $dx \to \frac{dx}{ds}\,ds$, and $dy \to \frac{dy}{ds}\,ds$. This reduces the complex
integral to the complex sum of real integrals. It is somewhat analogous
to the replacement of a vector integral by the vector sum of scalar integrals
(Section 1.9).

We can also proceed by dividing the contour from $z_0$ to $z_0'$ into n intervals by
picking $n - 1$ intermediate points $z_1, z_2, \ldots$, on the contour (Fig. 6.7). Consider
the sum

$$ S_n = \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}), \tag{6.34a} $$

where $\zeta_j$ is a point on the curve between $z_j$ and $z_{j-1}$. Now let $n \to \infty$ with

$$ |z_j - z_{j-1}| \to 0 $$

for all j. If the $\lim_{n \to \infty} S_n$ exists and is independent of the details of choosing
the points $z_j$ and $\zeta_j$ as long as they lie on the contour, then

$$ \lim_{n \to \infty} \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}) = \int_{z_0}^{z_0'} f(z)\,dz. \tag{6.34b} $$

The right-hand side of Eq. (6.34b) is called the contour integral of $f(z)$ (along
the specified contour C from $z = z_0$ to $z = z_0'$). When we integrate along
the contour in the opposite direction, $dz$ changes sign and the integral
changes sign.

An important example is the following contour integral. 


EXAMPLE 6.3.1 


Cauchy Integral for Powers Let us evaluate the contour integral $\oint_C z^n\,dz$,
where C is a circle of radius $r > 0$ around the origin $z = 0$ in the positive
mathematical sense (counterclockwise). In polar coordinates of Eq. (6.10) we
parameterize the circle as $z = re^{i\theta}$ and $dz = ire^{i\theta}\,d\theta$. For integer $n \ne -1$, we
then obtain

$$ \oint_C z^n\,dz = r^{n+1} \int_0^{2\pi} i \exp[i(n + 1)\theta]\,d\theta = (n + 1)^{-1} r^{n+1} \left[ e^{i(n+1)\theta} \right]_{\theta=0}^{2\pi} = 0 \tag{6.35} $$

because $2\pi$ is a period of $e^{i(n+1)\theta}$, whereas for $n = -1$

$$ \oint_C \frac{dz}{z} = \int_0^{2\pi} i\,d\theta = 2\pi i, \tag{6.36} $$

again independent of r.

Alternatively, we can integrate around a rectangle with the corners
$z_1, z_2, z_3, z_4$ to obtain for $n \ne -1$

$$ \oint z^n\,dz = \left. \frac{z^{n+1}}{n+1} \right|_{z_1}^{z_2} + \left. \frac{z^{n+1}}{n+1} \right|_{z_2}^{z_3} + \left. \frac{z^{n+1}}{n+1} \right|_{z_3}^{z_4} + \left. \frac{z^{n+1}}{n+1} \right|_{z_4}^{z_1} = 0 $$

because each corner point appears once as an upper and a lower limit that
cancel. For $n = -1$ the corresponding real parts of the logarithms cancel
similarly, but their imaginary parts involve the increasing arguments of the
points from $z_1$ to $z_4$ and, when we come back to the first corner $z_1$, its argument
has increased by $2\pi$ due to the multivaluedness of the logarithm so that $2\pi i$ is
left over as the value of the integral. Thus, the value of the integral involving
a multivalued function must be that which is reached in a continuous
fashion on the path being taken. These integrals are examples of Cauchy's
integral theorem, which we prove for general functions in the next section. ■
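The results of this example can be reproduced numerically by evaluating the
sum $S_n$ of Eq. (6.34a) along a parameterized closed path. In the sketch below
the point $\zeta_j$ is taken at the midpoint of each parameter interval; the function
names, the number of steps, and the particular circles and square are
assumptions chosen for illustration.

```python
import cmath
import math

def contour_integral(f, path, n_steps=20000):
    """Approximate the sum S_n = sum_j f(zeta_j) (z_j - z_{j-1}) for a path
    z(s), 0 <= s <= 1, with zeta_j taken at the middle of each interval."""
    total = 0j
    z_prev = path(0.0)
    for j in range(1, n_steps + 1):
        z_next = path(j / n_steps)
        zeta = path((j - 0.5) / n_steps)
        total += f(zeta) * (z_next - z_prev)
        z_prev = z_next
    return total

def circle(radius):
    """Counterclockwise circle around the origin."""
    return lambda s: radius * cmath.exp(2j * math.pi * s)

def polygon(corners):
    """Closed polygon through the given corners, traversed in list order."""
    m = len(corners)
    def path(s):
        t = s * m
        k = min(int(t), m - 1)
        a, b = corners[k], corners[(k + 1) % m]
        return a + (t - k) * (b - a)
    return path

square = polygon([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j])   # counterclockwise

I_power = contour_integral(lambda z: z ** 2, circle(1.5))      # n = 2
I_inv_small = contour_integral(lambda z: 1 / z, circle(0.5))   # n = -1
I_inv_large = contour_integral(lambda z: 1 / z, circle(3.0))   # n = -1
I_inv_square = contour_integral(lambda z: 1 / z, square)       # n = -1
I_inv2_square = contour_integral(lambda z: 1 / z ** 2, square) # n = -2

print(I_power, I_inv_small, I_inv_large, I_inv_square, I_inv2_square)
```

Only the $n = -1$ integrals differ from zero, and they agree for both radii and
for the square, in line with Eqs. (6.35) and (6.36).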


Stokes’s Theorem Proof of Cauchy’s Integral Theorem 


Cauchy’s integral theorem is the first of two basic theorems in the theory of 
the behavior of functions of a complex variable. We present a proof under 
relatively restrictive conditions of physics applications—conditions that are 
intolerable to the mathematician developing a beautiful abstract theory but 
that are usually satisfied in physical problems. Cauchy’s theorem states the 
following: 
If a function $f(z)$ is analytic (therefore single-valued) and its partial
derivatives are continuous throughout some simply connected region
R,⁶ for every closed path C (Fig. 6.8) in R the line integral of $f(z)$ around
C is zero or

$$ \oint_C f(z)\,dz = 0. \tag{6.37} $$

Figure 6.8 A Closed Contour C within a Simply Connected Region R

The symbol $\oint$ is used to emphasize that the path is closed.⁷
In this form the Cauchy integral theorem may be proved by direct application
of Stokes's theorem (Section 1.11). With $f(z) = u(x, y) + iv(x, y)$ and
$dz = dx + i\,dy$,

$$ \oint_C f(z)\,dz = \oint_C (u + iv)(dx + i\,dy) = \oint_C (u\,dx - v\,dy) + i \oint_C (v\,dx + u\,dy). \tag{6.38} $$

These two line integrals may be converted to surface integrals by Stokes's
theorem, a procedure that is justified if the partial derivatives are continuous
within C. In applying Stokes's theorem, note that the final two integrals of
Eq. (6.38) are real. Using

$$ \mathbf{V} = \hat{\mathbf{x}} V_x + \hat{\mathbf{y}} V_y, $$

Stokes's (or Green's) theorem states that (A is area enclosed by C)

$$ \oint_C (V_x\,dx + V_y\,dy) = \int_A \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) dx\,dy. \tag{6.39} $$


6 A simply connected region or domain is one in which every closed contour in that region encloses
only the points contained in it. If a region is not simply connected, it is called multiply connected.
As an example of a multiply connected region, consider the z-plane with the interior of the unit
circle excluded.

7 Recall that in Section 1.12 such a function $f(z)$, identified as a force, was labeled conservative.
For the first integral in the last part of Eq. (6.38), let $u = V_x$ and $v = -V_y$.⁸
Then

$$ \oint_C (u\,dx - v\,dy) = \oint_C (V_x\,dx + V_y\,dy) = \int_A \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) dx\,dy = -\int_A \left( \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \right) dx\,dy. \tag{6.40} $$

For the second integral on the right side of Eq. (6.38), we let $u = V_y$ and $v = V_x$.
Using Stokes's theorem again, we obtain

$$ \oint_C (v\,dx + u\,dy) = \int_A \left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) dx\,dy. \tag{6.41} $$

On application of the Cauchy-Riemann conditions that must hold, since $f(z)$
is assumed analytic, each integrand vanishes and

$$ \oint f(z)\,dz = -\int_A \left( \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \right) dx\,dy + i \int_A \left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) dx\,dy = 0. \tag{6.42} $$

A consequence of the Cauchy integral theorem is that for analytic functions
the line integral is a function only of its end points, independent of the path of
integration,

$$ \int_{z_1}^{z_2} f(z)\,dz = F(z_2) - F(z_1) = -\int_{z_2}^{z_1} f(z)\,dz, \tag{6.43} $$

again exactly like the case of a conservative force (Section 1.12).

In summary, a Cauchy integral around a closed contour $\oint f(z)\,dz = 0$ when
the function $f(z)$ is analytic in the simply connected region whose boundary
is the closed path of the integral. The Cauchy integral is a two-dimensional
analog of line integrals of conservative forces.
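Path independence can be illustrated with a small numerical experiment. The
sketch below integrates the analytic function $e^z$ and the nonanalytic function
$\Re(z)$ from $0$ to $1 + i$ along two different paths; the choice of functions,
end points, paths, and step count is an assumption made for illustration only.

```python
import cmath

def contour_integral(f, path, n_steps=20000):
    """Riemann-type sum of f along z(s), 0 <= s <= 1, using interval midpoints."""
    total = 0j
    z_prev = path(0.0)
    for j in range(1, n_steps + 1):
        z_next = path(j / n_steps)
        total += f(path((j - 0.5) / n_steps)) * (z_next - z_prev)
        z_prev = z_next
    return total

def polyline(points):
    """Open path made of straight segments through the given points."""
    m = len(points) - 1
    def path(s):
        t = s * m
        k = min(int(t), m - 1)
        return points[k] + (t - k) * (points[k + 1] - points[k])
    return path

straight = polyline([0j, 1 + 1j])             # directly from 0 to 1 + i
corner = polyline([0j, 1 + 0j, 1 + 1j])       # along the real axis, then upward

# Analytic integrand: both paths give F(z2) - F(z1) with F(z) = exp(z).
exp_straight = contour_integral(cmath.exp, straight)
exp_corner = contour_integral(cmath.exp, corner)
exp_exact = cmath.exp(1 + 1j) - 1

# Nonanalytic integrand Re(z): the result depends on the path.
re_straight = contour_integral(lambda z: z.real, straight)
re_corner = contour_integral(lambda z: z.real, corner)

print(exp_straight, exp_corner, exp_exact)
print(re_straight, re_corner)
```

For $\Re(z)$ the two paths give $(1 + i)/2$ and $1/2 + i$, so the integral around
the closed triangle formed by the two paths does not vanish.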


Multiply Connected Regions 


The original statement of our theorem demanded a simply connected region.
This restriction may easily be relaxed by the creation of a barrier, a contour
line. Consider the multiply connected region of Fig. 6.9, in which $f(z)$ is not
defined for the interior $R'$. Cauchy's integral theorem is not valid for the contour
C, as shown, but we can construct a contour $C'$ for which the theorem holds.
We draw a line from the interior forbidden region $R'$ to the forbidden region
exterior to R and then run a new contour $C'$, as shown in Fig. 6.10. The new
contour $C'$ through ABDEFGA never crosses the contour line that converts R
into a simply connected region. The three-dimensional analog of this technique
was used in Section 1.13 to prove Gauss's law.


8 In the proof of Stokes's theorem (Section 1.12), $V_x$ and $V_y$ are any two functions (with continuous
partial derivatives).

Figure 6.9 A Closed Contour C in a Multiply Connected Region

Figure 6.10 Conversion of a Multiply Connected Region into a Simply Connected Region


By Eq. (6.43),


$$ \int_G^A f(z)\,dz = -\int_D^E f(z)\,dz, \tag{6.44} $$


with $f(z)$ having been continuous across the contour line and line segments
DE and GA arbitrarily close together. Then

$$ \oint_{C'} f(z)\,dz = \int_{ABD} f(z)\,dz + \int_{EFG} f(z)\,dz = 0 \tag{6.45} $$

by Cauchy's integral theorem, with region R now simply connected. Applying
Eq. (6.43) once again with $ABD \to C_1'$ and $EFG \to -C_2'$, we obtain

$$ \oint_{C_1'} f(z)\,dz = \oint_{C_2'} f(z)\,dz, \tag{6.46} $$

in which $C_1'$ and $C_2'$ are both traversed in the same (counterclockwise)
direction.
Let us emphasize that the contour line here is a matter of mathematical
convenience to permit the application of Cauchy's integral theorem. Since $f(z)$
is analytic in the annular region, it is necessarily single-valued and continuous
across any such contour line.


Biographical Data

Cauchy, Augustin Louis, Baron. Cauchy, a French mathematician, was
born in 1789 in Paris and died in 1857 in Sceaux, France. In 1805, he entered
the École Polytechnique, where Ampère was one of his teachers. In 1816, he
replaced Monge in the Académie des Sciences, when Monge was expelled
for political reasons. He is regarded as the father of modern complex analysis;
his most famous contribution is his integral formula for analytic functions and their
derivatives, but he also contributed to partial differential equations and the
ether theory in electrodynamics.


EXERCISES 


6.3.1 Show that $\int_{z_1}^{z_2} f(z)\,dz = -\int_{z_2}^{z_1} f(z)\,dz$.

6.3.2 Prove that

$$ \left| \int_C f(z)\,dz \right| \le |f|_{\max} \cdot L, $$

where $|f|_{\max}$ is the maximum value of $|f(z)|$ along the contour C and L
is the length of the contour.


6.3.3 Verify that

$$ \int z^*\,dz $$

depends on the path by evaluating the integral for the two paths shown
in Fig. 6.11. Recall that $f(z) = z^*$ is not an analytic function of z and that
Cauchy's integral theorem therefore does not apply.

Figure 6.11 Contour


6.3.4 Show that

$$ \oint_C \frac{dz}{z^2 + z} = 0, $$

in which the contour C is (i) a circle defined by $|z| = R > 1$ and (ii) a
square with the corners $\pm 2 \pm 2i$.

Hint. Direct use of the Cauchy integral theorem is illegal. Why? The
integral may be evaluated by transforming to polar coordinates and using
tables. The preferred technique is the calculus of residues (Section 7.2).
This yields 0 for $R > 1$ and $2\pi i$ for $R < 1$.

6.3.5 Evaluate $\int_0^{2+i} |z|^2\,dz$ along a straight line from the origin to $2 + i$ and on
a second path along the real axis from the origin to 2 continuing from 2
to $2 + i$ parallel to the imaginary axis. Compare with the same integrals
where $|z|^2$ is replaced by $z^2$. Discuss why there is path dependence in
one case but not in the other.


6.4 Cauchy’s Integral Formula 


As in the preceding section, we consider a function $f(z)$ that is analytic on
a closed contour C and within the interior region bounded by C. We seek to
prove the Cauchy integral formula,

$$ \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\,dz = f(z_0), \tag{6.47} $$

in which $z_0$ is some point in the interior region bounded by C. This is the
second of the two basic theorems. Note that since z is on the contour C while
$z_0$ is in the interior, $z - z_0 \ne 0$ and the integral Eq. (6.47) is well defined.
Looking at the integrand we realize that although $f(z)$ is analytic within
C, the denominator vanishes at $z = z_0$. If $f(z_0) \ne 0$ and $z_0$ lies inside C, the
integrand is singular, and this singularity is defined as a first-order or
simple pole. The presence of the pole is essential for Cauchy's formula to hold
and in the $n = -1$ case of Example 6.3.1 as well. If the contour is deformed as
shown in Fig. 6.12 (or Fig. 6.10, Section 6.3), Cauchy's integral theorem applies.
By Eq. (6.46),

$$ \oint_C \frac{f(z)}{z - z_0}\,dz - \oint_{C_2} \frac{f(z)}{z - z_0}\,dz = 0, \tag{6.48} $$
Figure 6.12 Exclusion of a Singular Point

where C is the original outer contour and $C_2$ is the circle surrounding the point
$z_0$ traversed in a counterclockwise direction. Let $z = z_0 + re^{i\theta}$, using the polar
representation because of the circular shape of the path around $z_0$. Here, r is
small and will eventually be made to approach zero. We have

$$ \oint_{C_2} \frac{f(z)}{z - z_0}\,dz = \oint_{C_2} \frac{f(z_0 + re^{i\theta})}{re^{i\theta}}\, ire^{i\theta}\,d\theta. $$

Taking the limit as $r \to 0$, we obtain

$$ \oint_{C_2} \frac{f(z)}{z - z_0}\,dz = i f(z_0) \int_{C_2} d\theta = 2\pi i f(z_0) \tag{6.49} $$

since $f(z)$ is analytic and therefore continuous at $z = z_0$. This proves the
Cauchy integral formula [Eq. (6.47)].

Here is a remarkable result. The value of an analytic function $f(z)$ is given
at an interior point $z = z_0$ once the values on the boundary C are specified. This
is closely analogous to a two-dimensional form of Gauss's law (Section 1.13)
in which the magnitude of an interior line charge would be given in terms of
the cylindrical surface integral of the electric field E. A further analogy is the
determination of a function in real space by an integral of the function and
the corresponding Green's function (and their derivatives) over the bounding
surface. Kirchhoff diffraction theory is an example of this.

It has been emphasized that $z_0$ is an interior point. What happens if $z_0$ is
exterior to C? In this case, the entire integrand is analytic on and within C.
Cauchy's integral theorem (Section 6.3) applies and the integral vanishes. We
have

$$ \frac{1}{2\pi i} \oint_C \frac{f(z)\,dz}{z - z_0} = \begin{cases} f(z_0), & z_0 \text{ interior} \\ 0, & z_0 \text{ exterior.} \end{cases} $$
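Both cases can be checked numerically by discretizing the contour integral on
a circle. The sketch below assumes $f(z) = e^z$ and a circle of radius 2 around
the origin as the contour C; the function name `cauchy_formula` and all
numerical values are illustrative assumptions.

```python
import cmath
import math

def cauchy_formula(f, z0, center=0j, radius=2.0, n_steps=2000):
    """(1 / 2 pi i) times the contour integral of f(z) / (z - z0) over the
    counterclockwise circle |z - center| = radius, by an equally spaced sum."""
    dtheta = 2.0 * math.pi / n_steps
    total = 0j
    for j in range(n_steps):
        phase = cmath.exp(1j * (j + 0.5) * dtheta)
        z = center + radius * phase
        dz = 1j * radius * phase * dtheta
        total += f(z) / (z - z0) * dz
    return total / (2j * math.pi)

z_inside = 0.5 + 0.3j      # inside the circle |z| = 2
z_outside = 3.0 + 0.0j     # outside the circle

value_inside = cauchy_formula(cmath.exp, z_inside)
value_outside = cauchy_formula(cmath.exp, z_outside)

print(value_inside, cmath.exp(z_inside))   # boundary values reproduce f(z0)
print(value_outside)                       # vanishes for an exterior point
```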
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Derivatives 


Cauchy’s integral formula may be used to obtain an expression for the deriva¬ 
tive of f(z). From Eq. (6.47), with f(z) analytic, 


/(zo + ^o) ~ /(zo) = 1 / f /(z) d ~ / /(z) d \ 

Sz 0 2ni8zp \J z - z 0 - 8z 0 J z-z 0 ) 

Then, by definition of derivative [Eq. (6.22)], 


/(* o) = 


linr - 

Szo^O 2ni8Zp 


SzpfCz ) 

(z - Zo - <52 o)(z - « 0 ) 


dz 


1 

27ti 


m 

(z - Zp) 2 


dz. 


(6.50) 


This result could have been obtained by differentiating Eq. (6.47) under the 
integral sign with respect to zp. This formal or tuming-the-crank approach is 
valid, but the justification for it is contained in the preceding analysis. Again, the 
integrand / (z)/(z—zp) 2 is singular at z = Zp if f(zp) ^ 0, and this singularity 
is defined to be a second-order pole. 

This technique for constructing derivatives may be repeated. We write
$f'(z_0 + \delta z_0)$ and $f'(z_0)$ using Eq. (6.50). Subtracting, dividing by $\delta z_0$, and finally
taking the limit as $\delta z_0 \to 0$, we have

$$f^{(2)}(z_0) = \frac{2}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^3}.$$

Note that $f^{(2)}(z_0)$ is independent of the direction of $\delta z_0$ as it must be. If $f(z_0) \neq 0$,
then $f(z)/(z - z_0)^3$ has a singularity, which is defined to be a third-order
pole. Continuing, we get⁹

$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^{n+1}}; \tag{6.51}$$


that is, the requirement that f(z) be analytic guarantees not only a first derivative
but also derivatives of all orders. Note that the integrand has a pole
of order n + 1 at $z = z_0$ if $f(z_0) \neq 0$. The derivatives of f(z) are automatically
analytic. Notice that this statement assumes the Goursat version of the
Cauchy integral theorem [assuming f'(z) exists but need not be assumed to
be continuous; for a proof, see 5th ed. of Arfken and Weber’s Math. Methods].
This is also why Goursat’s contribution is so significant in the development of
the theory of complex variables.
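Equation (6.51) lends itself to a quick numerical check. The following Python sketch is not part of the original text: it approximates the contour integral on a circle around $z_0$ with the trapezoid rule. The names `circle_integral` and `cauchy_derivative`, the test function $e^z$, the radius, and the number of sample points are all illustrative assumptions.

```python
import cmath
import math


def circle_integral(g, center, radius, n_points=2000):
    """Approximate the counterclockwise contour integral of g around a circle."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        phase = cmath.exp(1j * k * dtheta)
        z = center + radius * phase
        dz = 1j * radius * phase * dtheta  # dz = i r e^{i theta} d(theta)
        total += g(z) * dz
    return total


def cauchy_derivative(f, z0, n, radius=1.0, n_points=2000):
    """n-th derivative of f at z0 from Eq. (6.51), integrating on |z - z0| = radius."""
    integral = circle_integral(lambda z: f(z) / (z - z0) ** (n + 1), z0, radius, n_points)
    return math.factorial(n) * integral / (2j * math.pi)


if __name__ == "__main__":
    z0 = 0.3 + 0.2j
    for n in range(4):
        # Every derivative of exp(z) is exp(z), so each error should be tiny.
        print(n, abs(cauchy_derivative(cmath.exp, z0, n) - cmath.exp(z0)))
    # A point exterior to the contour (here z0 = 2 with |z| = 1) gives a vanishing integral.
    print(abs(circle_integral(lambda z: cmath.exp(z) / (z - 2.0), 0.0, 1.0)))
```

For an integrand that is analytic near the circle, the trapezoid rule converges very quickly, which is why a modest number of points is enough in this sketch.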



Morera’s Theorem 

A further application of Cauchy’s integral formula is in the proof of Morera’s 
theorem, which is the converse of Cauchy’s integral theorem. The theorem 
states the following: 


9 This expression is the starting point for defining derivatives of fractional order. See Erdélyi, A.
(Ed.) (1954). Tables of Integral Transforms, Vol. 2. McGraw-Hill, New York. For recent applications
to mathematical analysis, see Osler, T. J. (1972). An integral analogue of Taylor’s series and its use
in computing Fourier transforms. Math. Comput. 26, 449.

If a function f(z) is continuous in a simply connected region R and
$\oint_C f(z)\,dz = 0$ for every closed contour C within R, then f(z) is analytic
throughout R.


Let us integrate f(z) from $z_1$ to $z_2$. Since every closed path integral of
f(z) vanishes, the integral is independent of path and depends only on its end
points. We label the result of the integration F(z), with

$$F(z_2) - F(z_1) = \int_{z_1}^{z_2} f(z)\,dz. \tag{6.52}$$

As an identity,

$$\frac{F(z_2) - F(z_1)}{z_2 - z_1} - f(z_1) = \frac{\int_{z_1}^{z_2}[f(t) - f(z_1)]\,dt}{z_2 - z_1}, \tag{6.53}$$


using t as another complex variable. Now we take the limit as $z_2 \to z_1$:

$$\lim_{z_2 \to z_1} \frac{\int_{z_1}^{z_2}[f(t) - f(z_1)]\,dt}{z_2 - z_1} = 0 \tag{6.54}$$

since f(t) is continuous.¹⁰ Therefore,

$$\lim_{z_2 \to z_1} \frac{F(z_2) - F(z_1)}{z_2 - z_1} = F'(z)\big|_{z=z_1} = f(z_1) \tag{6.55}$$


by definition of derivative [Eq. (6.22)]. We have proved that F'(z) at $z = z_1$
exists and equals $f(z_1)$. Since $z_1$ is any point in R, we see that F(z) is analytic.
Then by Cauchy’s integral formula [compare Eq. (6.51)] F'(z) = f(z) is also
analytic, proving Morera’s theorem.

Drawing once more on our electrostatic analog, we might use f(z) to represent
the electrostatic field E. If the net charge within every closed region in R
is zero (Gauss’s law), the charge density is everywhere zero in R. Alternatively,
in terms of the analysis of Section 1.12, f(z) represents a conservative force
(by definition of conservative), and then we find that it is always possible to
express it as the derivative of a potential function F(z).

An important application of Cauchy’s integral formula is the following
Cauchy inequality. If $f(z) = \sum a_n z^n$ is analytic and bounded, $|f(z)| \le M$
on a circle of radius r about the origin, then

$$|a_n|\, r^n \le M \quad \text{(Cauchy’s inequality)} \tag{6.56}$$

gives upper bounds for the coefficients of its Taylor expansion. To prove
Eq. (6.56), let us define $M(r) = \max_{|z|=r} |f(z)|$ and use the Cauchy integral
for $a_n$:

$$|a_n| = \frac{1}{2\pi}\left|\oint_{|z|=r} \frac{f(z)}{z^{n+1}}\,dz\right| \le M(r)\,\frac{2\pi r}{2\pi r^{n+1}}.$$

10 We can quote the mean value theorem of calculus here.


An immediate consequence of the inequality (6.56) is Liouville’s theorem: 


If f(z) is analytic and bounded in the complex plane, it is a constant. 


In fact, if $|f(z)| \le M$ for all z, then Cauchy’s inequality [Eq. (6.56)] gives
$|a_n| \le M r^{-n} \to 0$ as $r \to \infty$ for n > 0. Hence, $f(z) = a_0$.
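As a small illustration of Eq. (6.56), the sketch below checks the inequality for $f(z) = e^z$, whose Taylor coefficients are $a_n = 1/n!$ and whose maximum modulus on $|z| = r$ is $M(r) = e^r$. The function name `cauchy_bound_holds` and the sampled radii are assumptions made for this example only.

```python
import math


def cauchy_bound_holds(r, n_max=20):
    """Check |a_n| r^n <= M(r) for f(z) = exp(z), where a_n = 1/n! and M(r) = e^r."""
    max_modulus = math.exp(r)  # |exp(z)| = exp(Re z) is largest at z = r on |z| = r
    return all(r ** n / math.factorial(n) <= max_modulus for n in range(n_max + 1))


if __name__ == "__main__":
    for r in (0.5, 1.0, 5.0, 20.0):
        print(r, cauchy_bound_holds(r))
```

Here M(r) grows without bound as r increases, so $e^z$ is not bounded in the whole plane and Liouville’s theorem does not force it to be constant.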

Conversely, the slightest deviation of an analytic function from a constant
value implies that there must be at least one singularity somewhere in the
infinite complex plane. Apart from the trivial constant functions, then,
singularities are a fact of life, and we must learn to live with them. However, we
shall do more than that. We shall next expand a function in a Laurent series at
a singularity, and we shall use singularities to develop the powerful and useful
calculus of residues in Chapter 7.

A famous application of Liouville’s theorem yields the fundamental theorem
of algebra (due to C. F. Gauss), which states that any polynomial
$P(z) = \sum_{\nu=0}^{n} a_\nu z^\nu$ with n > 0 and $a_n \neq 0$ has n roots. To prove this, suppose
P(z) has no zero. Then f(z) = 1/P(z) is analytic and bounded as $|z| \to \infty$.
Hence, f(z) = 1/P is a constant by Liouville’s theorem, a contradiction. Thus,
P(z) has at least one root that we can divide out. Then we repeat the process
for the resulting polynomial of degree n − 1. This leads to the conclusion that
P(z) has exactly n roots.

In summary, if an analytic function f(z) is given on the boundary C of a simply
connected region R, then the values of the function and all its derivatives are
known at any point inside that region R in terms of Cauchy integrals

$$f(z_0) = \frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{z - z_0}, \qquad f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_C \frac{f(z)\,dz}{(z - z_0)^{n+1}}.$$

These Cauchy integrals are extremely important in numerous physics
applications.


EXERCISES 


6.4.1 Show that

$$\frac{1}{2\pi i}\oint z^{m-n-1}\,dz, \quad m \text{ and } n \text{ integers}$$

(with the contour encircling the origin once counterclockwise) is a
representation of the Kronecker $\delta_{mn}$.


6.4.2 Solve Exercise 6.3.4 by separating the integrand into partial fractions and
then applying Cauchy’s integral theorem for multiply connected regions.
Note. Partial fractions are explained in Section 15.7 in connection with
Laplace transforms.

6.4.3 Evaluate

$$\oint_C \frac{dz}{z^2 - 1},$$

where C is the circle |z| = 2. Alternatively, integrate around a square
with corners ±2 ± 2i.

6.4.4 Assuming that f(z) is analytic on and within a closed contour C and that
the point $z_0$ is within C, show that

$$\oint_C \frac{f'(z)}{z - z_0}\,dz = \oint_C \frac{f(z)}{(z - z_0)^2}\,dz.$$

6.4.5 You know that f(z) is analytic on and within a closed contour C. You
suspect that the nth derivative $f^{(n)}(z_0)$ is given by

$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz.$$

Using mathematical induction, prove that this expression is correct.

6.4.6 (a) A function f(z) is analytic within a closed contour C (and continuous
on C). If $f(z) \neq 0$ within C and $|f(z)| \ge M$ on C, show that

$$|f(z)| \ge M$$

for all points within C.

Hint. Consider w(z) = 1/f(z).

(b) If f(z) = 0 within the contour C, show that the foregoing result does
not hold, that is, it is possible to have |f(z)| = 0 at one or more points
in the interior with |f(z)| > 0 over the entire bounding contour. Cite
a specific example of an analytic function that behaves this way.

6.4.7 Using the Cauchy integral formula for the nth derivative, convert the
following Rodrigues’s formulas into Cauchy integrals with appropriate
contours:

(a) Legendre

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n.$$

ANS. $\dfrac{(-1)^n}{2^n}\cdot\dfrac{1}{2\pi i}\oint \dfrac{(1 - z^2)^n}{(z - x)^{n+1}}\,dz.$

(b) Hermite

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}.$$

(c) Laguerre

$$L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}(x^n e^{-x}).$$

Note. From these integral representations one can develop generating
functions for these special functions. Compare Sections 11.4,
13.1, and 13.2.

6.4.8 Obtain $\int_C z^*\,dz$, where C is the unit circle in the first quadrant. Compare
with the integral from z = 1 parallel to the imaginary axis to 1 + i and
from there to i parallel to the real axis.

6.4.9 Evaluate $\int_{2\pi}^{2\pi + i\infty} e^{iz}\,dz$ along two different paths of your choice.


6.5 Laurent Expansion 


Taylor Expansion

The Cauchy integral formula of the preceding section opens up the way for
another derivation of Taylor’s series (Section 5.6), but this time for functions
of a complex variable. Suppose we are trying to expand f(z) about $z = z_0$
and we have $z = z_1$ as the nearest point on the Argand diagram for which
f(z) is not analytic. We construct a circle C centered at $z = z_0$ with radius
$|z' - z_0| < |z_1 - z_0|$ (Fig. 6.13). Since $z_1$ was assumed to be the nearest point
at which f(z) was not analytic, f(z) is necessarily analytic on and within C.

Figure 6.13 Circular Domain for Taylor Expansion

From Eq. (6.47), the Cauchy integral formula,

$$\begin{aligned} f(z) &= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{z' - z} \\ &= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0) - (z - z_0)} \\ &= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0)[1 - (z - z_0)/(z' - z_0)]}, \end{aligned} \tag{6.57}$$

where z' is a point on the contour C and z is any point interior to C. We expand
the denominator of the integrand in Eq. (6.57) by the binomial theorem, which
generalizes to complex variables as in Example 6.2.1 for other elementary
functions. Or, we note the identity (for complex t)

$$\frac{1}{1 - t} = 1 + t + t^2 + t^3 + \cdots = \sum_{n=0}^{\infty} t^n, \tag{6.58}$$

which may easily be verified by multiplying both sides by 1 − t. The infinite
series, following the methods of Section 5.2, is convergent for |t| < 1. Upon
replacing the positive terms $a_n$ in a real series by absolute values $|a_n|$ of complex
numbers, the convergence criteria of Chapter 5 translate into valid
convergence theorems for complex series.

Now for a point z interior to C, $|z - z_0| < |z' - z_0|$, and using Eq. (6.58),
Eq. (6.57) becomes

$$f(z) = \frac{1}{2\pi i}\oint_C \sum_{n=0}^{\infty} \frac{(z - z_0)^n f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.59}$$

Interchanging the order of integration and summation [valid since Eq. (6.58)
is uniformly convergent for |t| < 1], we obtain

$$f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_C \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.60}$$

Referring to Eq. (6.51), we get

$$f(z) = \sum_{n=0}^{\infty} (z - z_0)^n \frac{f^{(n)}(z_0)}{n!}, \tag{6.61}$$

which is our desired Taylor expansion. Note that it is based only on the
assumption that f(z) is analytic for $|z - z_0| < |z_1 - z_0|$. Just as for real variable
power series (Section 5.7), this expansion is unique for a given $z_0$.

From the Taylor expansion for f(z) a binomial theorem may be derived
(Exercise 6.5.2).
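The step from the contour integrals in Eq. (6.60) to the Taylor series can be tried out numerically. The sketch below is an illustration only: the helper names `taylor_coefficient` and `taylor_sum`, the choice f(z) = cos z, and the unit-circle contour are assumptions. It evaluates each coefficient integral by the trapezoid rule and then sums the series at a point inside the circle.

```python
import cmath
import math


def taylor_coefficient(f, z0, n, radius=1.0, n_points=512):
    """a_n = (1/(2 pi i)) * closed integral of f(z') / (z' - z0)^(n+1), cf. Eq. (6.60)."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        phase = cmath.exp(1j * k * dtheta)
        zp = z0 + radius * phase
        total += f(zp) / (zp - z0) ** (n + 1) * (1j * radius * phase * dtheta)
    return total / (2j * math.pi)


def taylor_sum(f, z0, z, order, radius=1.0):
    """Partial sum of the Taylor series built from numerically computed coefficients."""
    return sum(taylor_coefficient(f, z0, n, radius) * (z - z0) ** n for n in range(order + 1))


if __name__ == "__main__":
    for n in range(5):
        # Compare with the known cosine coefficients 1, 0, -1/2, 0, 1/24.
        print(n, taylor_coefficient(cmath.cos, 0.0, n))
    z = 0.4 - 0.3j
    print(abs(taylor_sum(cmath.cos, 0.0, z, 20) - cmath.cos(z)))
```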


Schwarz Reflection Principle 


From the binomial expansion of $g(z) = (z - x_0)^n$ for integral n we see that the
complex conjugate of the function is the function of the complex conjugate,
for real $x_0$:

$$g^*(z) = [(z - x_0)^n]^* = (z^* - x_0)^n = g(z^*). \tag{6.62}$$

This leads us to the Schwarz reflection principle:


If a function f(z) is (1) analytic over some region including the real axis
and (2) real when z is real, then

$$f^*(z) = f(z^*) \tag{6.63}$$

(Fig. 6.14). It may be proved as follows. Expanding f(z) about some
(nonsingular) point $x_0$ on the real axis,

$$f(z) = \sum_{n=0}^{\infty} (z - x_0)^n \frac{f^{(n)}(x_0)}{n!} \tag{6.64}$$

by Eq. (6.60). Since f(z) is analytic at $z = x_0$, this Taylor expansion exists. Since
f(z) is real when z is real, $f^{(n)}(x_0)$ must be real for all n. Then when we use
Eq. (6.62), Eq. (6.63) (the Schwarz reflection principle) follows immediately.
Exercise 6.5.6 is another form of this principle. The Schwarz reflection
principle applies to all elementary functions and those in Example 6.2.1 in particular.

Figure 6.14 Schwarz Reflection
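A short numerical spot check of Eq. (6.63) is sketched below. The helper name `reflection_defect` and the sample point are assumptions for illustration. The functions sin z and exp z are real on the real axis and satisfy the principle, whereas f(z) = iz is imaginary there and violates it.

```python
import cmath


def reflection_defect(f, z):
    """Return |f*(z) - f(z*)|, which vanishes when Eq. (6.63) holds at z."""
    return abs(f(z).conjugate() - f(z.conjugate()))


if __name__ == "__main__":
    z = 0.7 + 1.3j
    print(reflection_defect(cmath.sin, z))         # real on the real axis
    print(reflection_defect(cmath.exp, z))         # real on the real axis
    print(reflection_defect(lambda w: 1j * w, z))  # imaginary on the real axis
```

For f(z) = iz the defect equals 2|z|, since $f^*(z) = -iz^*$ while $f(z^*) = iz^*$.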


Analytic Continuation 


It is natural to think of the values f(z) of an analytic function f as a single
entity that is usually defined in some restricted region $S_1$ of the complex plane,
for example, by a Taylor series (Fig. 6.15). Then f is analytic inside the circle
of convergence $C_1$, whose radius is given by the distance $r_1$ from the center
of $C_1$ to the nearest singularity of f at $z_1$ (in Fig. 6.15). If we choose a point
inside $C_1$ that is farther than $r_1$ from the singularity $z_1$ and make a Taylor
expansion of f about it ($z_2$ in Fig. 6.15), then the circle of convergence $C_2$
will usually extend beyond the first circle $C_1$. In the overlap region of both
circles $C_1$, $C_2$ the function f is uniquely defined. In the region of the circle $C_2$
that extends beyond $C_1$, f(z) is uniquely defined by the Taylor series about the
center of $C_2$ and analytic there, although the Taylor series about the center
of $C_1$ is no longer convergent there. After Weierstrass, this process is called
analytic continuation. It defines the analytic function in terms of its original
definition (e.g., in $C_1$) and all its continuations.

Figure 6.15 Analytic Continuation

A specific example is the meromorphic function

$$f(z) = \frac{1}{1 + z}, \tag{6.65}$$

which has a simple pole at z = −1 and is analytic elsewhere. The geometric
series expansion

$$\frac{1}{1 + z} = 1 - z + z^2 - \cdots = \sum_{n=0}^{\infty} (-z)^n \tag{6.66}$$

converges for |z| < 1 (i.e., inside the circle $C_1$ in Fig. 6.15).
Suppose we expand f(z) about z = i so that

$$f(z) = \frac{1}{1 + z} = \frac{1}{1 + i + (z - i)} = \frac{1}{(1 + i)(1 + (z - i)/(1 + i))} = \frac{1}{1 + i}\left[1 - \frac{z - i}{1 + i} + \frac{(z - i)^2}{(1 + i)^2} - \cdots\right] \tag{6.67}$$

converges for $|z - i| < |1 + i| = \sqrt{2}$. Our circle of convergence is $C_2$ in
Fig. 6.15. Now f(z) is defined by the expansion [Eq. (6.67)] in $S_2$ that overlaps
$S_1$ and extends further out in the complex plane.¹¹ This extension is an analytic


11 One of the most powerful and beautiful results of the more abstract theory of functions of a
complex variable is that if two analytic functions coincide in any region, such as the overlap of
$S_1$ and $S_2$, or coincide on any line segment, they are the same function in the sense that they
will coincide everywhere as long as they are both well defined. In this case, the agreement of
the expansions [Eqs. (6.66) and (6.67)] over the region common to $S_1$ and $S_2$ would establish the
identity of the functions these expansions represent. Then Eq. (6.67) would represent an analytic
continuation or extension of f(z) into regions not covered by Eq. (6.66). We could equally well
say that f(z) = 1/(1 + z) is an analytic continuation of either of the series given by Eqs. (6.66) and
(6.67).

Figure 6.16 $|z' - z_0|_{C_1} > |z - z_0|$; $|z' - z_0|_{C_2} < |z - z_0|$ (figure label: contour line)

continuation, and when we have only isolated singular points to contend with, 
the function can be extended indefinitely. Equations (6.65)-(6.67) are three 
different representations of the same function. Each representation has its own 
domain of convergence. Equation (6.66) is a Maclaurin series. Equation (6.67) 
is a Taylor expansion about z = i. 
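The two series representations can be compared numerically. In the sketch below, the function names and the two sample points are assumptions chosen for illustration. It evaluates partial sums of Eqs. (6.66) and (6.67) at a point in the overlap region and at a point that lies outside $C_1$ but inside $C_2$, where only the expansion about z = i converges.

```python
def maclaurin_partial(z, terms):
    """Partial sum of Eq. (6.66): sum of (-z)^n for n = 0, ..., terms - 1."""
    return sum((-z) ** n for n in range(terms))


def taylor_about_i_partial(z, terms):
    """Partial sum of Eq. (6.67): expansion of 1/(1 + z) about z = i."""
    q = -(z - 1j) / (1 + 1j)
    return sum(q ** n for n in range(terms)) / (1 + 1j)


if __name__ == "__main__":
    z_overlap = 0.3 + 0.4j  # |z| < 1 and |z - i| < sqrt(2)
    z_outside = 0.8 + 0.9j  # |z| > 1 but |z - i| < sqrt(2)
    for z in (z_overlap, z_outside):
        exact = 1 / (1 + z)
        print(z,
              abs(maclaurin_partial(z, 200) - exact),
              abs(taylor_about_i_partial(z, 200) - exact))
```

At the second point the Maclaurin partial sums grow without bound, while the series about z = i reproduces 1/(1 + z) to rounding accuracy.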

Analytic continuation may take many forms and the series expansion just
considered is not necessarily the most convenient technique. As an alternate
technique we shall use a recurrence relation in Section 10.1 to extend the
factorial function around the isolated singular points, z = −n, n = 1, 2, 3, ....


Laurent Series


We frequently encounter functions that are analytic in an annular region, for
example, of inner radius r and outer radius R, as shown in Fig. 6.16. Drawing an
imaginary contour line to convert our region into a simply connected region,
we apply Cauchy’s integral formula, and for two circles, $C_2$ and $C_1$, centered at
$z = z_0$ and with radii $r_2$ and $r_1$, respectively, where $r < r_2 < r_1 < R$, we have¹²

$$f(z) = \frac{1}{2\pi i}\oint_{C_1} \frac{f(z')\,dz'}{z' - z} - \frac{1}{2\pi i}\oint_{C_2} \frac{f(z')\,dz'}{z' - z}. \tag{6.68}$$

Note that in Eq. (6.68) an explicit minus sign has been introduced so that
contour $C_2$ (like $C_1$) is to be traversed in the positive (counterclockwise) sense. The
treatment of Eq. (6.68) now proceeds exactly like that of Eq. (6.57) in the
development of the Taylor series. Each denominator is written as $(z' - z_0) - (z - z_0)$
and expanded by the binomial theorem, which now follows from the Taylor
series [Eq. (6.61)].

12 We may take $r_2$ arbitrarily close to r and $r_1$ arbitrarily close to R, maximizing the area enclosed
between $C_1$ and $C_2$.

Noting that for $C_1$, $|z' - z_0| > |z - z_0|$, whereas for $C_2$, $|z' - z_0| < |z - z_0|$,
we find

$$f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}} + \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz'. \tag{6.69}$$


The minus sign of Eq. (6.68) has been absorbed by the binomial expansion.
Labeling the first series $S_1$ and the second $S_2$,

$$S_1 = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}, \tag{6.70}$$

which is the regular Taylor expansion, convergent for $|z - z_0| < |z' - z_0| = r_1$,
that is, for all z interior to the larger circle, $C_1$. For the second series in
Eq. (6.69), we have

$$S_2 = \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz', \tag{6.71}$$

convergent for $|z - z_0| > |z' - z_0| = r_2$, that is, for all z exterior to the smaller
circle $C_2$. Remember, $C_2$ goes counterclockwise.

These two series are combined into one series¹³ (a Laurent series) by

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \tag{6.72}$$

where

$$a_n = \frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{(z - z_0)^{n+1}}. \tag{6.73}$$


Since, in Eq. (6.72), convergence of a binomial expansion is no problem, C may
be any contour within the annular region $r < |z - z_0| < R$ encircling $z_0$ once in
a counterclockwise sense. The integrals are independent of the contour, and
Eq. (6.72) is the Laurent series or Laurent expansion of f(z).

The use of the contour line (Fig. 6.16) is convenient in converting the
annular region into a simply connected region. Since our function is analytic
in this annular region (and therefore single-valued), the contour line is not
essential and, indeed, does not appear in the final result [Eq. (6.72)]. For $n \ge 0$,
the integrand $f(z)/(z - z_0)^{n+1}$ is singular at $z = z_0$ if $f(z_0) \neq 0$. The integrand
has a pole of order n + 1 at $z = z_0$. If f has a first-order zero at $z = z_0$, then
$f(z)/(z - z_0)^{n+1}$ has a pole of order n, etc. The presence of poles is essential
for the validity of the Laurent formula.
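The coefficient formula, Eq. (6.73), can be evaluated directly on a circle inside the annulus. The following sketch is illustrative only: the names `laurent_coefficient` and `example_function`, the function $1/[z(z - 1)]$, and the contour radius 0.5 are assumptions. It applies the trapezoid rule for both positive and negative n.

```python
import cmath
import math


def laurent_coefficient(f, z0, n, radius, n_points=1024):
    """a_n of Eq. (6.73), integrating counterclockwise on |z - z0| = radius."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for k in range(n_points):
        phase = cmath.exp(1j * k * dtheta)
        z = z0 + radius * phase
        total += f(z) / (z - z0) ** (n + 1) * (1j * radius * phase * dtheta)
    return total / (2j * math.pi)


def example_function(z):
    """f(z) = 1 / (z (z - 1)), analytic in the annulus 0 < |z| < 1."""
    return 1 / (z * (z - 1))


if __name__ == "__main__":
    for n in range(-4, 4):
        print(n, laurent_coefficient(example_function, 0.0, n, radius=0.5))
```

For this function the computed values should be close to −1 for n ≥ −1 and close to 0 for n < −1. Any radius strictly between 0 and 1 should give the same coefficients, reflecting the contour independence noted above.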


13 Replace n by −n in $S_2$ and add.

EXAMPLE 6.5.1


Laurent series coefficients need not come from evaluation of contour
integrals (which may be very intractable). Other techniques such as ordinary
series expansions often provide the coefficients.

Numerous examples of Laurent series appear in Chapter 7. We start here 
with a simple example to illustrate the application of Eq. (6.72). 


Laurent Expansion by Integrals Let $f(z) = [z(z - 1)]^{-1}$. If we choose
$z_0 = 0$, then r = 0 and R = 1, f(z) diverging at z = 1. From Eqs. (6.73) and
(6.72),

$$a_n = \frac{1}{2\pi i}\oint \frac{dz'}{(z')^{n+2}(z' - 1)} = -\frac{1}{2\pi i}\oint \sum_{m=0}^{\infty} (z')^m \frac{dz'}{(z')^{n+2}}. \tag{6.74}$$

Again, interchanging the order of summation and integration (uniformly
convergent series), we have

$$a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{dz'}{(z')^{n+2-m}}.$$

If we employ the polar form, as before Eq. (6.35) (of Example 6.3.1),

$$a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{r i e^{i\theta}\,d\theta}{r^{n+2-m} e^{i(n+2-m)\theta}} \tag{6.75}$$

$$\phantom{a_n} = -\frac{1}{2\pi i}\cdot 2\pi i \sum_{m=0}^{\infty} \delta_{n+2-m,1}. \tag{6.76}$$

In other words,

$$a_n = \begin{cases} -1 & \text{for } n \ge -1, \\ 0 & \text{for } n < -1. \end{cases} \tag{6.77}$$

The Laurent expansion about z = 0 [Eq. (6.72)] becomes

$$\frac{1}{z(z - 1)} = -\frac{1}{z} - 1 - z - z^2 - z^3 - \cdots = -\sum_{n=-1}^{\infty} z^n. \tag{6.78}$$


For this simple function the Laurent series can, of course, be obtained by a
direct binomial expansion or partial fraction and geometric series expansion
as follows. We expand in partial fractions

$$f(z) = \frac{1}{z(z - 1)} = \frac{b_0}{z} + \frac{b_1}{z - 1},$$

where we determine $b_0$ at $z \to 0$,

$$\lim_{z \to 0} z f(z) = \lim_{z \to 0} \frac{1}{z - 1} = -1 = b_0 + \lim_{z \to 0} \frac{b_1 z}{z - 1} = b_0,$$

and $b_1$ at $z \to 1$ similarly,

$$\lim_{z \to 1} (z - 1) f(z) = \lim_{z \to 1} \frac{1}{z} = 1 = b_1 + \lim_{z \to 1} \frac{b_0 (z - 1)}{z} = b_1.$$

Expanding 1/(z − 1) in a geometric series yields the Laurent series [Eq. (6.78)].

EXAMPLE 6.5.2


The Laurent series differs from the Taylor series by the obvious feature of
negative powers of $(z - z_0)$. For this reason, the Laurent series will always
diverge at least at $z = z_0$ and perhaps as far out as some distance r (Fig. 6.16).


Laurent Expansion by Series Expand f(z) = exp(z) exp(1/z) in a Laurent
series $f(z) = \sum_n a_n z^n$ about the origin.

This function is analytic in the complex plane except at z = 0 and $z \to \infty$.
Moreover, f(1/z) = f(z), so that $a_{-n} = a_n$. Multiplying the power series

$$e^z e^{1/z} = \sum_{m=0}^{\infty} \frac{z^m}{m!} \sum_{n=0}^{\infty} \frac{1}{z^n n!},$$

we get the constant term $a_0$ from the products of the m = n terms as

$$a_0 = \sum_{m=0}^{\infty} \frac{1}{(m!)^2}.$$

The coefficient of $z^k$ comes from the products of the terms $z^{m+k}/(m+k)!$ and
$1/(z^m m!)$, that is,

$$a_k = a_{-k} = \sum_{m=0}^{\infty} \frac{1}{m!\,(m + k)!}.$$

From the ratio test or the absence of singularities in the finite complex plane
for $z \neq 0$, this Laurent series converges for |z| > 0. ■
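The series result for $a_k$ can be cross-checked against the contour integral of Eq. (6.73). The sketch below is a hedged illustration; the function names, the truncation at 40 terms, and the unit-circle contour are assumptions. It computes the coefficients both ways.

```python
import cmath
import math


def series_coefficient(k, terms=40):
    """a_k = sum over m of 1 / (m! (m + |k|)!), using the symmetry a_{-k} = a_k."""
    k = abs(k)
    return sum(1.0 / (math.factorial(m) * math.factorial(m + k)) for m in range(terms))


def contour_coefficient(k, n_points=256):
    """a_k from Eq. (6.73) on the unit circle for f(z) = exp(z) exp(1/z)."""
    total = 0j
    dtheta = 2 * math.pi / n_points
    for j in range(n_points):
        z = cmath.exp(1j * j * dtheta)
        # On the unit circle dz = i z d(theta).
        total += cmath.exp(z) * cmath.exp(1 / z) / z ** (k + 1) * (1j * z * dtheta)
    return total / (2j * math.pi)


if __name__ == "__main__":
    for k in range(-3, 4):
        print(k, series_coefficient(k), contour_coefficient(k))
```

The two columns should agree to rounding accuracy, and the values for k and −k should coincide, as required by f(1/z) = f(z).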


Biographical Data 

Laurent, Pierre-Alphonse. Laurent, a French mathematician, was born 
in 1813 and died in 1854. He contributed to complex analysis, his famous 
theorem being published in 1843. 


The Taylor expansion of an analytic function about a regular point follows 
from Cauchy’s integral formulas. The radius of convergence of a Taylor series 
around a regular point is given by its distance to the nearest singularity. An 
analytic function can be expanded in a power series with positive and negative 
(integer) exponents about an arbitrary point, which is called its Laurent series; 
it converges in an annular region around a singular point and becomes its Taylor 
series around a regular point. If there are infinitely many negative exponents 
in its Laurent series the function has an essential singularity; if the Laurent 
series breaks off with a finite negative exponent it has a pole of that order at 
the expansion point. Analytic continuation of an analytic function from some
neighborhood of a regular point to its natural domain by means of successive
Taylor or Laurent series, an integral representation, or functional equation is 
a concept unique to the theory of analytic functions that highlights its power. 

EXERCISES 

6.5.1 Develop the Taylor expansion of ln(1 + z).

ANS. $\sum_{n=1}^{\infty} (-1)^{n-1}\dfrac{z^n}{n}$.

6.5.2 Derive the binomial expansion

$$(1 + z)^m = 1 + mz + \frac{m(m - 1)}{1 \cdot 2}z^2 + \cdots = \sum_{n=0}^{\infty} \binom{m}{n} z^n$$

for m any real number. The expansion is convergent for |z| < 1.

6.5.3 A function f(z) is analytic on and within the unit circle. Also, |f(z)| < 1
for $|z| \le 1$ and f(0) = 0. Show that |f(z)| < |z| for $|z| \le 1$.

Hint. One approach is to show that f(z)/z is analytic and then express
$[f(z_0)/z_0]^n$ by the Cauchy integral formula. Finally, consider absolute
magnitudes and take the nth root. This exercise is sometimes called
Schwarz’s theorem.

6.5.4 If f(z) is a real function of the complex variable z = x + iy [i.e., f(x) =
f*(x)], and the Laurent expansion about the origin, $f(z) = \sum a_n z^n$,
has $a_n = 0$ for n < −N, show that all of the coefficients, $a_n$, are real.
Hint. Show that $z^N f(z)$ is analytic (via Morera’s theorem; Section 6.4).

6.5.5 A function f(z) = u(x, y) + iv(x, y) satisfies the conditions for the 
Schwarz reflection principle. Show that 

(a) u is an even function of y. (b) v is an odd function of y. 

6.5.6 A function f(z) can be expanded in a Laurent series about the origin
with the coefficients $a_n$ real. Show that the complex conjugate of this
function of z is the same function of the complex conjugate of z; that
is,

$$f^*(z) = f(z^*).$$

Verify this explicitly for

(a) $f(z) = z^n$, n an integer, (b) f(z) = sin z.

If f(z) = iz ($a_1 = i$), show that the foregoing statement does not hold.

6.5.7 The function f(z) is analytic in a domain that includes the real axis.
When z is real (z = x), f(x) is pure imaginary.

(a) Show that

$$f(z^*) = -[f(z)]^*.$$

(b) For the specific case f(z) = iz, develop the Cartesian forms of
f(z), f(z*), and f*(z). Do not quote the general result of part (a).

6.5.8 Develop the first three nonzero terms of the Laurent expansion of

$$f(z) = (e^z - 1)^{-1}$$

about the origin.

6.5.9 Prove that the Laurent expansion of a given function about a given
point is unique; that is, if

$$f(z) = \sum_{n=-N}^{\infty} a_n (z - z_0)^n = \sum_{n=-N}^{\infty} b_n (z - z_0)^n,$$

show that $a_n = b_n$ for all n.

Hint. Use the Cauchy integral formula.

6.5.10 (a) Develop a Laurent expansion of $f(z) = [z(z - 1)]^{-1}$ about the point
z = 1 valid for small values of |z − 1|. Specify the exact range over
which your expansion holds. This is an analytic continuation of
Eq. (6.78).

(b) Determine the Laurent expansion of f(z) about z = 1 but for |z − 1|
large.

Hint. Use partial fraction of this function and the geometric series.

6.5.11 (a) Given $f_1(z) = \int_0^{\infty} e^{-zt}\,dt$ (with t real), show that the domain in
which $f_1(z)$ exists (and is analytic) is $\Re(z) > 0$.

(b) Show that $f_2(z) = 1/z$ equals $f_1(z)$ over $\Re(z) > 0$ and is therefore
an analytic continuation of $f_1(z)$ over the entire z-plane except for
z = 0.

(c) Expand 1/z about the point z = i. You will have $f_3(z) = \sum_{n=0}^{\infty} a_n (z - i)^n$.
What is the domain of $f_3(z)$?

ANS. $\dfrac{1}{z} = -i\sum_{n=0}^{\infty} i^n (z - i)^n$, |z − i| < 1.

6.5.12 Expand f(z) = sin(z/(1 − z)) in a Laurent series about z = 1.

6.5.13 Expand

$$f(z) = \frac{z^3 - 2z^2 + 1}{(z - 3)(z^2 + 3)}$$

in a Laurent series about (i) z = 3, (ii) $z = \pm i\sqrt{3}$, (iii) z = 1, and
(iv) $z = \tfrac{1}{2}(1 \pm \sqrt{5})$.

6.5.14 Find the Laurent series of ln((1 + z²)/(1 − z²)) at ∞.

6.5.15 Write z in polar form and set up the relations that have to be satisfied
for ln z = ln |z| + i arg z and ln(1 + z) defined by its Maclaurin series to
be consistent.

In the preceding sections, we defined analytic functions and developed some
of their main features. Now we introduce some of the more geometric aspects
of functions of complex variables, aspects that will be useful in visualizing
the integral operations in Chapter 7 and that are valuable in their own right in
solving Laplace’s equation in two-dimensional systems.

In ordinary analytic geometry we may take y = f(x) and then plot y versus
x. Our problem here is more complicated because z is a function of two real
variables x and y. We use the notation

$$w = f(z) = u(x, y) + iv(x, y). \tag{6.79}$$

Then for a point in the z-plane (specific values for x and y) there may
correspond specific values for u(x, y) and v(x, y) that then yield a point in the
w-plane. As points in the z-plane transform or are mapped into points in the
w-plane, lines or areas in the z-plane will be mapped into lines or areas in the
w-plane. Our immediate purpose is to see how lines and areas map from the
z-plane to the w-plane for a number of simple functions.


Translation

$$w = z + z_0. \tag{6.80}$$

The function w is equal to the variable z plus a constant, $z_0 = x_0 + iy_0$. By
Eqs. (6.2) and (6.80),

$$u = x + x_0, \qquad v = y + y_0, \tag{6.81}$$

representing a pure translation of the coordinate axes as shown in Fig. 6.17.

Figure 6.17 Translation


Rotation

$$w = z z_0. \tag{6.82}$$

Here, it is convenient to return to the polar representation, using

$$w = \rho e^{i\varphi}, \qquad z = r e^{i\theta}, \qquad \text{and} \qquad z_0 = r_0 e^{i\theta_0}, \tag{6.83}$$

then

$$\rho e^{i\varphi} = r r_0 e^{i(\theta + \theta_0)} \tag{6.84}$$

or

$$\rho = r r_0, \qquad \varphi = \theta + \theta_0. \tag{6.85}$$

Two things have occurred. First, the modulus r has been modified, either
expanded or contracted, by the factor $r_0$. Second, the argument θ has been
increased by the additive constant $\theta_0$ (Fig. 6.18). This represents a rotation of
the complex variable through an angle $\theta_0$. For the special case of $z_0 = i$, we
have a pure rotation through π/2 radians.

Figure 6.18 Rotation


Inversion

$$w = \frac{1}{z}. \tag{6.86}$$

Again, using the polar form, we have

$$\rho e^{i\varphi} = \frac{1}{r e^{i\theta}} = \frac{1}{r} e^{-i\theta}, \tag{6.87}$$

which shows that

$$\rho = \frac{1}{r}, \qquad \varphi = -\theta. \tag{6.88}$$

Figure 6.19 Inversion

The radial part of Eq. (6.87) shows the inversion clearly. The interior of the unit
circle is mapped onto the exterior and vice versa (Fig. 6.19). In addition, the
angular part of Eq. (6.87) shows that the polar angle is reversed in sign.
Equation (6.86) therefore also involves a reflection of the y-axis (like the complex
conjugate equation).
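The three elementary maps are easy to experiment with using Python’s built-in complex numbers. The sketch below is illustrative: the function names and the sample values r = 2, θ = π/6, $r_0$ = 1.5, $\theta_0$ = π/4 are assumptions. It prints the modulus and argument of each image so that Eqs. (6.85) and (6.88) can be read off directly.

```python
import cmath
import math


def translate(z, z0):
    return z + z0  # Eq. (6.80)


def rotate(z, z0):
    return z * z0  # Eq. (6.82)


def invert(z):
    return 1 / z   # Eq. (6.86)


if __name__ == "__main__":
    z = cmath.rect(2.0, math.pi / 6)   # r = 2, theta = pi/6
    z0 = cmath.rect(1.5, math.pi / 4)  # r0 = 1.5, theta0 = pi/4
    print(translate(z, z0) - z)        # the shift equals z0 for every z
    print(cmath.polar(rotate(z, z0)))  # (r r0, theta + theta0)
    print(cmath.polar(invert(z)))      # (1/r, -theta)
```

Note that `cmath.polar` reports the argument between −π and π, so angles outside that interval are wrapped back into it.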

To see how lines in the z-plane transform into the w-plane, we simply return
to the Cartesian form:

$$u + iv = \frac{1}{x + iy}. \tag{6.89}$$

Rationalizing the right-hand side by multiplying numerator and denominator
by z* and then equating the real parts and the imaginary parts, we have

$$u = \frac{x}{x^2 + y^2}, \qquad x = \frac{u}{u^2 + v^2}, \qquad v = -\frac{y}{x^2 + y^2}, \qquad y = -\frac{v}{u^2 + v^2}. \tag{6.90}$$

A circle centered at the origin in the z-plane has the form

$$x^2 + y^2 = r^2 \tag{6.91}$$

and by Eq. (6.90) transforms into

$$\frac{u^2}{(u^2 + v^2)^2} + \frac{v^2}{(u^2 + v^2)^2} = r^2. \tag{6.92}$$

Figure 6.20 Inversion, Line ↔ Circle

Simplifying Eq. (6.92), we obtain

$$u^2 + v^2 = \frac{1}{r^2} = \rho^2, \tag{6.93}$$

which describes a circle in the w-plane also centered at the origin.

The horizontal line $y = c_1$ transforms into

$$\frac{-v}{u^2 + v^2} = c_1 \tag{6.94}$$

or

$$u^2 + \left(v + \frac{1}{2c_1}\right)^2 = \frac{1}{(2c_1)^2}, \tag{6.95}$$

which describes a circle in the w-plane of radius $\tfrac{1}{2}c_1^{-1}$ and centered at u =
0, $v = -\tfrac{1}{2}c_1^{-1}$ (Fig. 6.20). We pick up the other three possibilities, $x = \pm c_1$, $y =
-c_1$, by rotating the xy-axes. In general, any straight line or circle in the
z-plane will transform into a straight line or a circle in the w-plane (compare
Exercise 6.6.1).
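The statement that the line $y = c_1$ maps onto the circle of Eq. (6.95) can be checked point by point. In the sketch below, the helper names and the sample abscissas are assumptions for illustration. Each image point w = 1/z is compared with the circle of center $-i/(2c_1)$ and radius $1/(2|c_1|)$.

```python
def image_of_horizontal_line(c1, xs):
    """Map the points x + i c1 of the line y = c1 with the inversion w = 1/z."""
    return [1 / complex(x, c1) for x in xs]


def distance_from_circle(w, c1):
    """Deviation of w from the circle of Eq. (6.95): center -i/(2 c1), radius 1/(2|c1|)."""
    center = complex(0.0, -1.0 / (2.0 * c1))
    return abs(abs(w - center) - 1.0 / (2.0 * abs(c1)))


if __name__ == "__main__":
    c1 = 0.5
    xs = [-10.0, -1.0, -0.2, 0.0, 0.3, 2.0, 50.0]
    print(max(distance_from_circle(w, c1) for w in image_of_horizontal_line(c1, xs)))
```

Points far out on the line (large |x|) land close to w = 0, which is the image of the point at infinity and completes the circle.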


Branch Points and Multivalent Functions


The three transformations just discussed all involved one-to-one
correspondence of points in the z-plane to points in the w-plane. Now to illustrate the
variety of transformations that are possible and the problems that can arise,
we introduce first a two-to-one correspondence and then a many-to-one
correspondence. Finally, we take up the inverses of these two transformations.

Consider first the transformation

$$w = z^2, \tag{6.96}$$

which leads to

$$\rho = r^2, \qquad \varphi = 2\theta. \tag{6.97}$$

Clearly, our transformation is nonlinear because the modulus is squared, but
the significant feature of Eq. (6.96) is that the phase angle or argument is
doubled. This means that the

• first quadrant of z, $0 \le \theta < \pi/2$ → upper half-plane of w, $0 \le \varphi < \pi$,

• upper half-plane of z, $0 \le \theta < \pi$ → whole plane of w, $0 \le \varphi < 2\pi$.

The lower half-plane of z maps into the already covered entire plane of w, thus
covering the w-plane a second time. This is our two-to-one correspondence:
two distinct points in the z-plane, $z_0$ and $z_0 e^{i\pi} = -z_0$, corresponding to the
single point $w = z_0^2$.

In Cartesian representation,

$$u + iv = (x + iy)^2 = x^2 - y^2 + i2xy, \tag{6.98}$$

leading to

$$u = x^2 - y^2, \qquad v = 2xy. \tag{6.99}$$

Hence, the lines $u = c_1$, $v = c_2$ in the w-plane correspond to $x^2 - y^2 = c_1$, $2xy =
c_2$, rectangular (and orthogonal) hyperbolas in the z-plane (Fig. 6.21). To every
point on the hyperbola $x^2 - y^2 = c_1$ in the right half-plane, x > 0, one point on
the line $u = c_1$ corresponds and vice versa. However, every point on the line
$u = c_1$ also corresponds to a point on the hyperbola $x^2 - y^2 = c_1$ in the left
half-plane, x < 0, as already explained.

Figure 6.21 Mapping: Hyperbolic Coordinates

It will be shown in Section 6.7 that if lines in the w-plane are orthogonal
the corresponding lines in the z-plane are also orthogonal, as long as the
transformation is analytic. Since $u = c_1$ and $v = c_2$ are constructed perpendicular
to each other, the corresponding hyperbolas in the z-plane are orthogonal. We
have constructed a new orthogonal system of hyperbolic lines. Exercise 2.3.3
was an analysis of this system. Note that if the hyperbolic lines are electric
or magnetic lines of force, then we have a quadrupole lens useful in focusing
beams of high-energy particles.
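The orthogonality of the two families of hyperbolas can be spot-checked numerically. The gradients of u and v are normal to the curves u = const and v = const, so the curves cross at right angles wherever the gradients are perpendicular. The sketch below uses central differences; the helper names, the step size, and the sample point are assumptions for illustration.

```python
def u(x, y):
    return x * x - y * y  # real part of z^2, Eq. (6.99)


def v(x, y):
    return 2 * x * y      # imaginary part of z^2, Eq. (6.99)


def gradient(g, x, y, h=1e-6):
    """Central-difference gradient of g at the point (x, y)."""
    return ((g(x + h, y) - g(x - h, y)) / (2 * h),
            (g(x, y + h) - g(x, y - h)) / (2 * h))


def gradient_dot(x, y):
    """Dot product of grad u and grad v; zero means the level curves are orthogonal."""
    gu = gradient(u, x, y)
    gv = gradient(v, x, y)
    return gu[0] * gv[0] + gu[1] * gv[1]


if __name__ == "__main__":
    for point in [(1.2, -0.7), (0.5, 0.5), (-2.0, 3.0)]:
        print(point, gradient_dot(*point))
```

Analytically, grad u = (2x, −2y) and grad v = (2y, 2x), so the dot product vanishes identically; the printed values differ from zero only by rounding.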


The inverse of the fourth transformation [Eq. (6.96)] is

$$w = z^{1/2}. \tag{6.100}$$

From the relation

$$\rho e^{i\varphi} = r^{1/2} e^{i\theta/2}, \tag{6.101}$$

and

$$2\varphi = \theta, \tag{6.102}$$

we now have two points in the w-plane (arguments φ and φ + π) corresponding
to one point in the z-plane (except for the point z = 0). In other words, θ and
θ + 2π correspond to φ and φ + π, two distinct points in the w-plane. This
is the complex variable analog of the simple real variable equation $y^2 = x$,
in which two values of y, plus and minus, correspond to each value of x.
Replacing z → 1/z for z → 0 in Eq. (6.100) shows that our function w(z)
behaves similarly around the point at infinity.

The important point here is that we can make the function w of Eq. (6.100)
a single-valued function instead of a double-valued function if we agree to
restrict $\theta$ to a range such as $0 \le \theta < 2\pi$. This may be done by agreeing never
to cross the line $\theta = 0$ in the z-plane (Fig. 6.22). Such a line of demarcation is
called a cut line.

Figure 6.22 A Cut Line

The cut line joins the two branch point singularities at 0 and $\infty$, where
the function is clearly not analytic. Any line from $z = 0$ to infinity would serve
equally well. The purpose of the cut line is to restrict the argument of z. The
points $z$ and $z\exp(2\pi i)$ coincide in the z-plane but yield different points $w$
and $-w = w\exp(\pi i)$ in the w-plane. Hence, in the absence of a cut line the
function $w = z^{1/2}$ is ambiguous. Alternatively, since the function $w = z^{1/2}$ is
double valued, we can also glue two sheets of the complex z-plane together
along the cut line so that arg(z) increases beyond $2\pi$ along the cut line and
steps down from $4\pi$ on the second sheet to the start on the first sheet. This
construction is called the Riemann surface of $w = z^{1/2}$. We shall encounter
branch points and cut lines frequently in Chapter 7.
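The sign change of $w = z^{1/2}$ after one circuit of the origin can be followed numerically. The sketch below is only an illustration under stated assumptions: it walks around the unit circle in small steps and, at each step, keeps whichever of the two roots lies closer to the previous value, which mimics moving continuously on the Riemann surface. The helper names `continue_sqrt` and `circle` are hypothetical. Note that Python's `cmath.sqrt` places its own cut along the negative real axis, not along $\theta = 0$ as in the text; the continuation step makes the result independent of that choice.

```python
import cmath
import math


def continue_sqrt(z_path):
    """Follow w = z**(1/2) continuously along a list of z values.

    Of the two roots +sqrt(z) and -sqrt(z), the one closer to the previous
    value is kept, so no jump occurs where cmath.sqrt has its cut.
    """
    w = cmath.sqrt(z_path[0])
    values = [w]
    for z in z_path[1:]:
        root = cmath.sqrt(z)
        w = root if abs(root - w) <= abs(-root - w) else -root
        values.append(w)
    return values


def circle(turns, steps=2000):
    """Points on the unit circle starting at z = 1, going around `turns` times."""
    return [cmath.exp(2j * math.pi * turns * k / steps) for k in range(steps + 1)]


w_one_turn = continue_sqrt(circle(1))[-1]    # ends on the second sheet
w_two_turns = continue_sqrt(circle(2))[-1]   # back on the first sheet

print(w_one_turn, w_two_turns)
```

After one turn the continued value is $-1$ although $z$ has returned to $1$; only after a second turn does $w$ come back to $+1$, in line with the two-sheeted surface described above.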

The transformation

$$w = e^z \tag{6.103}$$

leads to

$$\rho e^{i\varphi} = e^{x+iy} \tag{6.104}$$

or

$$\rho = e^x, \qquad \varphi = y. \tag{6.105}$$

If y ranges over $0 \le y < 2\pi$ (or $-\pi < y \le \pi$), then $\varphi$ covers the same range.
However, this is the whole w-plane. In other words, a horizontal strip in the
z-plane of width $2\pi$ maps into the entire w-plane. Furthermore, any point
$x + i(y + 2n\pi)$, in which n is any integer, maps into the same point [by Eq. (6.104)]
in the w-plane. We have a many-(infinitely many)-to-one correspondence.
Finally, as the inverse of the fifth transformation [Eq. (6.103)], we have

$$w = \ln z. \tag{6.106}$$

By expanding it, we obtain

$$u + iv = \ln r e^{i\theta} = \ln r + i\theta. \tag{6.107}$$

For a given point $z_0$ in the z-plane, the argument $\theta$ is unspecified within an
integral multiple of $2\pi$. This means that

$$v = \theta + 2n\pi, \tag{6.108}$$

and we have the inverse of the situation for the exponential transformation:
each point of the z-plane corresponds to infinitely many points of the w-plane.

Equation (6.106) has a nice physical representation. If we go around the
unit circle in the z-plane, $r = 1$, then by Eq. (6.107), $u = \ln r = 0$; however,
$v = \theta$, and $\theta$ increases steadily and keeps increasing as we continue
past $2\pi$.

The cut line joins the branch point at the origin with infinity. As $\theta$ increases
past $2\pi$, we glue a new sheet of the complex z-plane along the cut line, etc.
Going around the unit circle in the z-plane is like the advance of a screw as it
is rotated or the ascent of a person walking up a spiral staircase (Fig. 6.23),
which is the Riemann surface of $w = \ln z$.

Figure 6.23 The Riemann Surface for ln z, a Multivalued Function

As in the preceding example, we can also make the correspondence unique
[and Eq. (6.106) unambiguous] by restricting $\theta$ to a range such as $0 \le \theta < 2\pi$
by taking the line $\theta = 0$ (positive real axis) as a cut line. This is equivalent to
taking one and only one complete turn of the spiral staircase.
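The spiral-staircase picture can be made concrete with a short numerical sketch. Under the assumption that the path is followed in small steps and never passes through $z = 0$, the logarithm is continued by adding the small increment $\ln(z_{k+1}/z_k)$ at each step. The function name `continued_log` and the choice of three turns are illustrative, not taken from the text.

```python
import cmath
import math


def continued_log(z_path):
    """Follow w = ln z continuously along a path that avoids z = 0."""
    w = cmath.log(z_path[0])
    values = [w]
    for z_prev, z in zip(z_path, z_path[1:]):
        # For a small step the ratio is close to 1, so its principal
        # logarithm is the small change in ln z without any 2*pi jump.
        w = w + cmath.log(z / z_prev)
        values.append(w)
    return values


steps = 1000
turns = 3
path = [cmath.exp(2j * math.pi * turns * k / steps) for k in range(steps + 1)]
w_end = continued_log(path)[-1]

print(w_end)   # u stays at ln 1 = 0 while v climbs by 2*pi per turn
```

Each complete turn raises $v$ by $2\pi$ while $u = \ln r$ stays at zero, which is one flight of the staircase per circuit.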

It is because of the multivalued nature of $\ln z$ that the contour integral

$$\oint \frac{dz}{z} = 2\pi i \neq 0,$$

integrating about the origin. This property appears in Exercise 6.4.1 and is the
basis for the entire calculus of residues (Chapter 7).


SUMMARY


The concept of mapping is a very broad and useful one in mathematics. Our
mapping from a complex z-plane to a complex w-plane is a simple generalization
of one definition of function: a mapping of x (from one set) into y in a
second set.

A more sophisticated form of mapping appears in Section 1.14, in which
we use the Dirac delta function $\delta(x - a)$ to map a function $f(x)$ into its value
at the point a. In Chapter 15, integral transforms are used to map one function
$f(x)$ in x-space into a second (related) function $F(t)$ in t-space.


EXERCISES

6.6.1 How do circles centered on the origin in the z-plane transform for

(a) $w_1(z) = z + \dfrac{1}{z}$, (b) $w_2(z) = z - \dfrac{1}{z}$, for $z \neq 0$?

What happens when $|z| \to 1$?


6.6.2 What part of the z-plane corresponds to the interior of the unit circle in
the w-plane if

(a) $w = \dfrac{z - 1}{z + 1}$, (b) $w = \dfrac{z - i}{z + i}$?

6.6.3 Discuss the transformations

(a) $w(z) = \sin z$, (c) $w(z) = \sinh z$,

(b) $w(z) = \cos z$, (d) $w(z) = \cosh z$.

Show how the lines $x = c_1$, $y = c_2$ map into the w-plane. Note that the last
three transformations can be obtained from the first one by appropriate
translation and/or rotation.

6.6.4 Show that the function

$$w(z) = (z^2 - 1)^{1/2}$$

is single-valued if we take $-1 \le x \le 1$, $y = 0$ as a cut line.

6.6.5 An integral representation of the Bessel function follows the contour in
the t-plane shown in Fig. 6.24. Map this contour into the $\theta$-plane with
$t = e^{\theta}$. Many additional examples of mapping are given in Chapters 11-13.

Figure 6.24 

Bessel Function 
Integration Contour 



6.6.6 For noninteger m, show that the binomial expansion of Exercise 6.5.2
holds only for a suitably defined branch of the function $(1 + z)^m$. Show
how the z-plane is cut. Explain why $|z| < 1$ may be taken as the circle
of convergence for the expansion of this branch, in light of the cut you
have chosen.

6.6.7 The Taylor expansion of Exercises 6.5.2 and 6.6.6 is not suitable for
branches other than the one suitably defined branch of the function
$(1 + z)^m$ for noninteger m. [Note that other branches cannot have the same
Taylor expansion since they must be distinguishable.] Using the same
branch cut of the earlier exercises for all other branches, find the
corresponding Taylor expansions, detailing the phase assignments and Taylor
coefficients.


6.7 Conformal Mapping 


In Section 6.6, hyperbolas were mapped into straight lines and straight lines
were mapped into circles. However, in all these transformations one feature
stayed constant: the angles between intersecting curves. We now address this
property. The constancy was a result of the fact that all the transformations
of Section 6.6 were analytic.

Figure 6.25 Conformal Mapping: Preservation of Angles

As long as $w = f(z)$ is an analytic function, we have

$$\frac{df}{dz} = \frac{dw}{dz} = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z}. \tag{6.109}$$

Assuming that this equation is in polar form, we may equate modulus to modulus
and argument to argument. For the latter (assuming that $df/dz \neq 0$),

$$\arg \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0} \arg \frac{\Delta w}{\Delta z}
= \lim_{\Delta z \to 0} \arg \Delta w - \lim_{\Delta z \to 0} \arg \Delta z = \arg \frac{df}{dz} = \alpha, \tag{6.110}$$


where $\alpha$, the argument of the derivative, may depend on z but is a constant for
fixed z, independent of the direction of approach. To see the significance of
this, consider two curves, $C_z$ in the z-plane and the corresponding curve $C_w$ in
the w-plane (Fig. 6.25). The increment $\Delta z$ is shown at an angle of $\theta$ relative to
the real (x) axis, whereas the corresponding increment $\Delta w$ forms an angle of
$\varphi$ with the real (u) axis. From Eq. (6.110),

$$\varphi = \theta + \alpha, \tag{6.111}$$

or any line in the z-plane is rotated through an angle $\alpha$ in the w-plane as long
as w is an analytic transformation and the derivative is not zero.$^{14}$

Since this result holds for any line through $z_0$, it will hold for a pair of lines.
Then for the angle between these two lines

$$\varphi_2 - \varphi_1 = (\theta_2 + \alpha) - (\theta_1 + \alpha) = \theta_2 - \theta_1, \tag{6.112}$$

which shows that the included angle is preserved under an analytic
transformation. Such angle-preserving transformations are called conformal. The
rotation angle $\alpha$ will, in general, depend on z. In addition, $|f'(z)|$ will usually
be a function of z.
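Equations (6.111) and (6.112) can be tested numerically. The following sketch assumes the sample map $f(z) = z^2$, the sample point $z_0 = 1 + i$, and two arbitrary directions; none of these choices come from the text. It measures the direction of $\Delta w$ for a small step $\Delta z$ and compares the included angle before and after the mapping.

```python
import cmath


def image_direction(f, z0, theta, h=1e-6):
    """Angle of the image increment Delta w for a small step
    Delta z = h * exp(i * theta) taken from the point z0."""
    dz = h * cmath.exp(1j * theta)
    return cmath.phase(f(z0 + dz) - f(z0))


def f(z):
    """Sample analytic map (an assumption for illustration); f'(z) = 2z."""
    return z ** 2


z0 = 1 + 1j                    # f'(z0) = 2 + 2i, so alpha = pi/4
theta1, theta2 = 0.2, 1.1      # directions of two lines through z0

phi1 = image_direction(f, z0, theta1)
phi2 = image_direction(f, z0, theta2)
alpha = cmath.phase(2 * z0)    # argument of the derivative at z0

angle_before = theta2 - theta1
angle_after = phi2 - phi1

print(phi1 - theta1, phi2 - theta2, alpha)
print(angle_before, angle_after)
```

Both lines are rotated by the same angle $\alpha$, so their included angle is unchanged up to the small error of the finite step $h$.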


14 If $df/dz = 0$, its argument or phase is undefined and the (analytic) transformation will not
necessarily preserve angles.

SUMMARY


Historically, conformal transformations have been of great importance to
scientists and engineers in solving Laplace's equation for problems of
electrostatics, hydrodynamics, heat flow, and so on. Unfortunately, the conformal
transformation approach, however elegant, is limited to problems that can be
reduced to two dimensions. The method is often beautiful if there is a high
degree of symmetry present but often impossible if the symmetry is broken or
absent.

Because of these limitations and primarily because high-speed electronic 
computers offer a useful alternative (iterative solution of the partial differential 
equation), the details and applications of conformal mappings are omitted. 


EXERCISES 

6.7.1 Expand $w(z)$ in a Taylor series about the point $z = z_0$, where $f'(z_0) = 0$.
(Angles are not preserved.) Show that if the first $n - 1$ derivatives vanish
but $f^{(n)}(z_0) \neq 0$, then angles in the z-plane with vertices at $z = z_0$ appear
in the w-plane multiplied by n.

6.7.2 In the transformation

$$e^z = \frac{a - w}{a + w},$$

how do the coordinate lines in the z-plane transform? What coordinate
system have you constructed?

6.7.3 Develop a conformal transformation that maps a circle in the z-plane into
a circle in the w-plane. Consider first circles with centers at the origin
and then those with arbitrary centers. Plot several cases using graphical
software.

6.7.4 Develop a conformal transformation that maps straight lines parallel to
the coordinate axes in the z-plane into parabolas in the w-plane. Plot
several parabolas using graphical software.


Functions of a Complex Variable II

Calculus of Residues 


7.1 Singularities 


In this chapter we return to the line of analysis that started with the
Cauchy-Riemann conditions in Chapter 6 and led to the Laurent expansion
(Section 6.5). The Laurent expansion represents a generalization of the Taylor
series in the presence of singularities. We define the point $z_0$ as an isolated
singular point of the function $f(z)$ if $f(z)$ is not analytic at $z = z_0$ but is
analytic and single valued in a punctured disk $0 < |z - z_0| < R$ for some
positive R. For rational functions, which are quotients of polynomials,
$f(z) = P(z)/Q(z)$, the only singularities arise from zeros of the denominator if the
numerator is nonzero there. For example, the function $f(z)$ from Exercise
6.5.13 has simple poles at $z = \pm i\sqrt{3}$ and $z = 3$ and is regular everywhere
else. A function that is analytic throughout the finite complex plane except
for isolated poles is called meromorphic. Examples are entire functions
that have no singularities in the finite complex plane, such as $e^z$, $\sin z$, $\cos z$;
rational functions with a finite number of poles; and $\tan z$, $\cot z$ with infinitely
many isolated simple poles at $z = (2n+1)\pi/2$ and $z = n\pi$ for $n = 0, \pm 1, \pm 2, \ldots$,
respectively.

From Cauchy's integral we learned that a loop integral of a function around
a simple pole gives a nonzero result, whereas higher order poles do not
contribute to the integral (Example 6.3.1). We consider in this chapter the
generalization of this case to meromorphic functions leading to the residue theorem,
which has important applications to many integrals that physicists and
engineers encounter, some of which we will discuss. Here, singularities, and simple
poles in particular, play a dominant role.

In the Laurent expansion of $f(z)$ about $z_0$

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \tag{7.1}$$

if $a_n = 0$ for $n < -m < 0$ and $a_{-m} \neq 0$, we say that $z_0$ is a pole of order m.
For instance, if $m = 1$ (that is, if $a_{-1}/(z - z_0)$ is the first nonvanishing term
in the Laurent series), we have a pole of order 1, often called a simple pole.
Example 6.5.1 is a relevant case: The function

$$f(z) = [z(z - 1)]^{-1} = -\frac{1}{z} + \frac{1}{z - 1}$$

has a simple pole at the origin and at $z = 1$. Its square, $f^2(z)$, has poles of order
2 at the same places and $[f(z)]^m$ has poles of order $m = 1, 2, 3, \ldots$. In contrast,
the function

$$e^{1/z} = \sum_{n=0}^{\infty} \frac{1}{z^n\, n!}$$

from Example 6.5.2 has poles of any order at $z = 0$.

If there are poles of any order (i.e., the summation in the Laurent series at
$z_0$ continues to $n = -\infty$), then $z_0$ is a pole of infinite order and is called an
essential singularity. These essential singularities have many pathological
features. For instance, we can show that in any small neighborhood of an
essential singularity of $f(z)$ the function $f(z)$ comes arbitrarily close to any
(and therefore every) preselected complex quantity $w_0$.$^{1}$ Literally, the entire
w-plane is mapped into the neighborhood of the point $z_0$, the essential singularity.
One point of fundamental difference between a pole of finite order and an
essential singularity is that a pole of order m can be removed by multiplying
$f(z)$ by $(z - z_0)^m$. This obviously cannot be done for an essential singularity.
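The behavior near an essential singularity can be illustrated with $e^{1/z}$. The sketch below is an illustration under stated assumptions, not a proof: for an arbitrarily chosen nonzero target $w_0$, the points $z_k = 1/(\ln w_0 + 2\pi i k)$ satisfy $e^{1/z_k} = w_0$ for every integer k, and they crowd toward $z = 0$ as k grows. The function name and the sample value of $w_0$ are hypothetical.

```python
import cmath
import math


def solutions_near_zero(w0, k_values):
    """Solutions z_k of exp(1/z) = w0, one for each integer k (needs w0 != 0)."""
    return [1 / (cmath.log(w0) + 2j * math.pi * k) for k in k_values]


w0 = 3 - 2j                                 # arbitrary nonzero target value
zs = solutions_near_zero(w0, [10, 100, 1000])

residuals = [abs(cmath.exp(1 / z) - w0) for z in zs]
moduli = [abs(z) for z in zs]               # shrink roughly like 1/(2*pi*k)

print(residuals)
print(moduli)
```

However small a neighborhood of the origin is chosen, some $z_k$ lies inside it, so the value $w_0$ is attained there; the same construction works for any other nonzero $w_0$.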

The behavior of $f(z)$ as $z \to \infty$ is defined in terms of the behavior of $f(1/t)$
as $t \to 0$. Consider the function

$$\sin z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}. \tag{7.2}$$

As $z \to \infty$, we replace the z by $1/t$ to obtain

$$\sin\left(\frac{1}{t}\right) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!\, t^{2n+1}}. \tag{7.3}$$

Clearly, from the definition, $\sin z$ has an essential singularity at infinity. This
result could be anticipated from Exercise 6.1.9 since

$$\sin z = \sin iy = i \sinh y, \quad \text{when } x = 0,$$

which approaches infinity exponentially as $y \to \infty$. Thus, although the
absolute value of $\sin x$ for real x is equal to or less than unity, the absolute value of
$\sin z$ is not bounded. The same applies to $\cos z$.

1 This theorem is due to Picard. A proof is given by E. C. Titchmarsh, The Theory of Functions,
2nd ed. Oxford Univ. Press, New York (1939).


Branch Points 


There is another sort of singularity that will be important in the later sections
of this chapter and that we encountered in Chapter 6 in the context of inverting
powers and the exponential function, namely roots and logarithms. Consider

$$f(z) = z^a,$$

in which a is not an integer.$^{2}$ As z moves around the unit circle from $e^{0}$ to
$e^{2\pi i}$,

$$f(z) \to e^{2\pi a i} \neq e^{0 i}$$

for nonintegral a. As in Section 6.6, we have a branch point at the origin and
another at infinity. If we set $z = 1/t$, a similar analysis for $t \to 0$ shows that
$t = 0$, that is, $z = \infty$, is also a branch point. The points $e^{0i}$ and $e^{2\pi i}$ in the z-plane
coincide but these coincident points lead to different values of $f(z)$; that
is, $f(z)$ is a multivalued function. The problem is resolved by constructing a
cut line joining both branch points so that $f(z)$ will be uniquely specified
for a given point in the z-plane. For $z^a$ the cut line can go out at any angle.
Note that the point at infinity must be included here; that is, the cut line may
join finite branch points via the point at infinity. The next example is a case in
point. If $a = p/q$ is a rational number, then q is called the order of the branch
point because one needs to go around the branch point q times before coming
back to the starting point or, equivalently, the Riemann surface of $z^{1/q}$ and $z^{p/q}$
is made up of q sheets, as discussed in Chapter 6. If a is irrational, then the
order of the branch point is infinite, just as for the logarithm.
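The order of the branch point can be found by counting circuits. After n turns around the origin, $z^a$ picks up the factor $e^{2\pi i a n}$, so the function returns to its starting value when this factor equals 1. The sketch below searches for the smallest such n; the function name, the search limit `max_turns`, and the tolerance are assumptions made for illustration.

```python
import cmath
import math
from fractions import Fraction


def sheets_until_return(a, max_turns=50, tol=1e-9):
    """Smallest number of circuits n around z = 0 after which z**a returns
    to its starting value, i.e. exp(2*pi*i*a*n) = 1; None if no such n
    is found up to max_turns."""
    for n in range(1, max_turns + 1):
        if abs(cmath.exp(2j * math.pi * float(a) * n) - 1) < tol:
            return n
    return None


order_half = sheets_until_return(Fraction(1, 2))         # square root
order_two_thirds = sheets_until_return(Fraction(2, 3))   # z**(2/3)
order_irrational = sheets_until_return(math.sqrt(2))     # does not close

print(order_half, order_two_thirds, order_irrational)
```

For $a = p/q$ in lowest terms the count is q; for the irrational exponent no return is found within the search limit, which is the numerical shadow of a branch point of infinite order.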

Note that a function with a branch point and a required cut line will not be 
continuous across the cut line. In general, there will be a phase difference on 
opposite sides of this cut line. Exercise 7.2.23 is an example of this situation. 
Hence, line integrals on opposite sides of this branch point cut line will not 
generally cancel each other. Numerous examples of this case appear in the 
exercises. 

The contour line used to convert a multiply connected region into a simply 
connected region (Section 6.3) is completely different. Our function is contin¬ 
uous across this contour line, and no phase difference exists. 


EXAMPLE 7.1.1


Function with Two Branch Points Consider the function

$$f(z) = (z^2 - 1)^{1/2} = (z + 1)^{1/2} (z - 1)^{1/2}. \tag{7.4}$$

The first factor on the right-hand side, $(z + 1)^{1/2}$, has a branch point at $z = -1$.
The second factor has a branch point at $z = +1$. Each branch point has order
2 because the Riemann surface is made up of two sheets. At infinity $f(z)$ has
a simple pole. This is best seen by substituting $z = 1/t$ and making a binomial
expansion at $t = 0$:

$$(z^2 - 1)^{1/2} = \frac{1}{t}(1 - t^2)^{1/2} = \frac{1}{t} \sum_{n=0}^{\infty} \binom{1/2}{n} (-1)^n t^{2n}
= \frac{1}{t} - \frac{1}{2}t - \frac{1}{8}t^3 + \cdots.$$

The cut line has to connect both branch points so that it is not possible
to encircle either branch point completely. To check the possibility of
taking the line segment joining $z = +1$ and $z = -1$ as a cut line, let us follow
the phases of these two factors as we move along the contour shown in
Fig. 7.1.

For convenience in following the changes of phase, let $z + 1 = re^{i\theta}$ and
$z - 1 = \rho e^{i\varphi}$. Then the phase of $f(z)$ is $(\theta + \varphi)/2$. We start at point 1, where
both $z + 1$ and $z - 1$ have a phase of zero. Moving from point 1 to point 2, $\varphi$,
the phase of $z - 1 = \rho e^{i\varphi}$, increases by $\pi$. ($z - 1$ becomes negative.) $\varphi$ then
stays constant until the circle is completed, moving from 6 to 7. $\theta$, the phase
of $z + 1 = re^{i\theta}$, shows a similar behavior, increasing by $2\pi$ as we move from 3
to 5. The phase of the function $f(z) = (z + 1)^{1/2}(z - 1)^{1/2} = r^{1/2}\rho^{1/2}e^{i(\theta + \varphi)/2}$ is
$(\theta + \varphi)/2$. This is tabulated in the final column of Table 7.1.

2 $z = 0$ is technically a singular point because $z^a$ has only a finite number of derivatives, whereas
an analytic function is guaranteed an infinite number of derivatives (Section 6.4). The problem is
that $f(z)$ is not single-valued as we encircle the origin. The Cauchy integral formula may not be
applied.


Table 7.1 Phase Angle

Point    $\theta$    $\varphi$    $(\theta + \varphi)/2$
1        0           0            0
2        0           $\pi$        $\pi/2$
3        0           $\pi$        $\pi/2$
4        $\pi$       $\pi$        $\pi$
5        $2\pi$      $\pi$        $3\pi/2$
6        $2\pi$      $\pi$        $3\pi/2$
7        $2\pi$      $2\pi$       $2\pi$


Figure 7.1 Cut Line Joining Two Branch Points at $\pm 1$

Figure 7.2 Branch Points Joined by (a) a Finite Cut Line and (b) Two Cut Lines from 1
to $\infty$ and $-1$ to $-\infty$ that Form a Single Cut Line Through the Point at Infinity.
Phase Angles are Measured as Shown in (a)

Two features emerge:

1. The phase at points 5 and 6 is not the same as the phase at points 2 and 3.
This behavior can be expected at a branch point cut line.

2. The phase at point 7 exceeds that at point 1 by $2\pi$ and the function $f(z) =
(z^2 - 1)^{1/2}$ is therefore single-valued for the contour shown, encircling
both branch points.

If we take the x-axis $-1 \le x \le 1$ as a cut line, $f(z)$ is uniquely specified.
Alternatively, the positive x-axis for $x > 1$ and the negative x-axis for $x < -1$
may be taken as cut lines. In this case, the branch points at $\pm 1$ are joined by the
cut line via the point at infinity. Again, the branch points cannot be encircled
and the function remains single-valued. ■

Generalizing from this example, the phase of a function

$$f(z) = f_1(z) \cdot f_2(z) \cdot f_3(z) \cdots$$

is the algebraic sum of the phase of its individual factors:

$$\arg f(z) = \arg f_1(z) + \arg f_2(z) + \arg f_3(z) + \cdots.$$

The phase of an individual factor may be taken as the arctangent of the ratio
of its imaginary part to its real part,

$$\arg f_i(z) = \tan^{-1}(v_i/u_i),$$

but one should be aware of the different branches of arctangent. For the case
of a factor of the form

$$f_i(z) = (z - z_0),$$

the phase corresponds to the phase angle of a two-dimensional vector from
$+z_0$ to z, with the phase increasing by $2\pi$ as the point $+z_0$ is encircled, provided
it is measured without crossing a cut line ($z_0 = 1$ in Fig. 7.2a). Conversely, the
traversal of any closed loop not encircling $z_0$ does not change the phase of $z - z_0$.
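The phase bookkeeping for $f(z) = (z - 1)^{1/2}(z + 1)^{1/2}$ can be repeated numerically. The sketch below is an illustration only: it sums the continuous change of $\arg(z - p)$ for each branch point p as z runs once counterclockwise around a circle, and halves the total to obtain the change in $\arg f$. The circles used (radius 2 about the origin, radius 0.5 about $z = 1$) and the function name are assumptions, chosen in place of the contour of Fig. 7.1.

```python
import cmath
import math


def phase_change(center, radius, points, steps=4000):
    """Total continuous change of arg(z - p), summed over the points p,
    as z goes once counterclockwise around a circle."""
    total = 0.0
    for p in points:
        prev = center + radius - p
        for k in range(1, steps + 1):
            z = center + radius * cmath.exp(2j * math.pi * k / steps)
            # The phase of the ratio is the small increment of arg(z - p).
            total += cmath.phase((z - p) / prev)
            prev = z - p
    return total


branch_points = [1.0, -1.0]

# arg f is half the summed phase of the two factors.
around_both = 0.5 * phase_change(0.0, 2.0, branch_points)   # encircles -1 and +1
around_one = 0.5 * phase_change(1.0, 0.5, branch_points)    # encircles +1 only

print(around_both, around_one)
```

Encircling both branch points changes $\arg f$ by $2\pi$, so $f$ returns to its starting value; encircling only one changes it by $\pi$, so $f$ changes sign, which is why the cut must prevent such a loop.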


SUMMARY 


Poles are the simplest singularities, and functions that have only poles
besides regular points are called meromorphic. Examples are $\tan z$ and ratios
of polynomials. Branch points are singularities characteristic of multivalued
functions. Examples are fractional powers of the complex variable z, whereas
the logarithm has branch points of infinite order at the origin of the complex
plane and at infinity. Essential singularities are the most complicated ones,
and many functions have one, such as $\cos z$ and $\sin z$ at infinity.


EXERCISES 


7.1.1 The function $f(z)$ expanded in a Laurent series exhibits a pole of order
m at $z = z_0$. Show that the coefficient of $(z - z_0)^{-1}$, $a_{-1}$, is given by

$$a_{-1} = \frac{1}{(m - 1)!} \frac{d^{m-1}}{dz^{m-1}} \left[ (z - z_0)^m f(z) \right]_{z = z_0},$$

with

$$a_{-1} = \left[ (z - z_0) f(z) \right]_{z = z_0},$$

when the pole is a simple pole ($m = 1$). These equations for $a_{-1}$ are
extremely useful in determining the residue to be used in the residue
theorem of the next section.

Hint. The technique that was so successful in proving the uniqueness of
power series (Section 5.7) will work here also.


7.1.2 A function $f(z)$ can be represented by

$$f(z) = \frac{f_1(z)}{f_2(z)},$$

where $f_1(z)$ and $f_2(z)$ are analytic. The denominator $f_2(z)$ vanishes
at $z = z_0$, showing that $f(z)$ has a pole at $z = z_0$. However, $f_1(z_0) \neq 0$,
$f_2'(z_0) \neq 0$. Show that $a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in a Laurent
expansion of $f(z)$ at $z = z_0$, is given by

$$a_{-1} = \frac{f_1(z_0)}{f_2'(z_0)}.$$

This result leads to the Heaviside expansion theorem (Section 15.12).


7.1.3 In analogy with Example 7.1.1, consider in detail the phase of each factor
and the resultant overall phase of $f(z) = (z^2 + 1)^{1/2}$ following a contour
similar to that of Fig. 7.1 but encircling the new branch points.

7.1.4 As an example of an essential singularity, consider $e^{1/z}$ as z approaches
zero. For any complex number $z_0$, $z_0 \neq 0$, show that

$$e^{1/z} = z_0$$

has an infinite number of solutions.


7.1.5 If the analytic function $f(z)$ goes to zero for $|z| \to \infty$, show that its
residue ($a_{-1}$ as defined in Exercise 7.1.1) at infinity is $-\lim_{z \to \infty} z f(z)$;
if $f(z)$ has a finite (nonzero) limit at infinity, show that its residue at
infinity is $-\lim_{z \to \infty} z^2 f(z)$.


7.2 Calculus of Residues


Residue Theorem


If the Laurent expansion of a function $f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$ is integrated
term by term by using a closed contour that encircles one isolated singular
point $z_0$ once in a counterclockwise sense, we obtain (Example 6.3.1)

$$a_n \oint (z - z_0)^n\, dz = a_n \left. \frac{(z - z_0)^{n+1}}{n + 1} \right|_{z_1}^{z_1} = 0, \quad n \neq -1. \tag{7.5}$$

However, if $n = -1$, using the polar form $z = z_0 + re^{i\theta}$ we find that

$$a_{-1} \oint (z - z_0)^{-1}\, dz = a_{-1} \oint \frac{i r e^{i\theta}\, d\theta}{r e^{i\theta}} = 2\pi i\, a_{-1}. \tag{7.6}$$

The first and simplest case of a residue occurred in Example 6.3.1 involving
$\oint z^n\, dz = 2\pi i\, \delta_{n,-1}$, where the integration is anticlockwise around a circle of
radius r. Of all powers $z^n$, only $1/z$ contributes.

Summarizing Eqs. (7.5) and (7.6), we have

$$\frac{1}{2\pi i} \oint f(z)\, dz = a_{-1}. \tag{7.7}$$

The constant $a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in the Laurent expansion, is
called the residue of $f(z)$ at $z = z_0$.
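The statement that only the power $1/z$ contributes can be checked with a direct numerical contour integral. The sketch below is a minimal illustration, assuming a circle of radius 1.5 about the origin and equally spaced sample points; the helper name `circle_integral` is hypothetical.

```python
import cmath
import math


def circle_integral(f, center=0j, radius=1.0, steps=2000):
    """Approximate the counterclockwise integral of f(z) dz around a circle
    using equally spaced points in the angle theta."""
    total = 0j
    for k in range(steps):
        phase = cmath.exp(2j * math.pi * k / steps)
        z = center + radius * phase
        dz = 1j * radius * phase * (2 * math.pi / steps)   # dz = i r e^{i theta} d theta
        total += f(z) * dz
    return total


# Integrate z**n for n = -3, ..., 2 around a circle of radius 1.5.
results = {n: circle_integral(lambda z, n=n: z ** n, radius=1.5)
           for n in range(-3, 3)}

for n, value in results.items():
    print(n, value)
```

Every power gives zero to rounding accuracy except $n = -1$, which gives $2\pi i$ independent of the radius.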

A set of isolated singularities can be handled by deforming our contour as
shown in Fig. 7.3. Cauchy's integral theorem (Section 6.3) leads to

$$\oint_C f(z)\, dz + \oint_{C_0} f(z)\, dz + \oint_{C_1} f(z)\, dz + \oint_{C_2} f(z)\, dz + \cdots = 0. \tag{7.8}$$

Figure 7.3 Excluding Isolated Singularities

The circular integral around any given singular point is given by Eq. (7.7),

$$\oint_{C_i} f(z)\, dz = -2\pi i\, a_{-1 z_i}, \tag{7.9}$$

assuming a Laurent expansion about the singular point $z = z_i$. The negative
sign comes from the clockwise integration as shown in Fig. 7.3. Combining
Eqs. (7.8) and (7.9), we have

$$\oint_C f(z)\, dz = 2\pi i\, (a_{-1 z_0} + a_{-1 z_1} + a_{-1 z_2} + \cdots)
= 2\pi i\, (\text{sum of enclosed residues}). \tag{7.10}$$


This is the residue theorem. The problem of evaluating one or more con¬ 
tour integrals is replaced by the algebraic problem of computing residues at the 
enclosed singular points (poles of order 1). In the remainder of this section, we 
apply the residue theorem to a wide variety of definite integrals of mathemati¬ 
cal and physical interest. The residue theorem will also be needed in Chapter 15 
for a variety of integral transforms, particularly the inverse Laplace transform. 
We also use the residue theorem to develop the concept of the Cauchy principal 
value. 
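As a numerical check of the residue theorem, the sketch below uses the function $f(z) = [z(z - 1)]^{-1} = -1/z + 1/(z - 1)$ of Example 6.5.1, whose residues are $-1$ at $z = 0$ and $+1$ at $z = 1$. The circle radii and the helper name are assumptions made for illustration.

```python
import cmath
import math


def circle_integral(f, center, radius, steps=4000):
    """Counterclockwise integral of f(z) dz around a circle,
    approximated with equally spaced points in the angle."""
    total = 0j
    for k in range(steps):
        phase = cmath.exp(2j * math.pi * k / steps)
        total += f(center + radius * phase) * 1j * radius * phase
    return total * (2 * math.pi / steps)


def f(z):
    """f(z) = 1/(z (z - 1)) = -1/z + 1/(z - 1)."""
    return 1 / (z * (z - 1))


residues = {0: -1.0, 1: 1.0}              # read off from the partial fractions

small = circle_integral(f, 0j, 0.5)       # encloses z = 0 only
large = circle_integral(f, 0j, 2.0)       # encloses z = 0 and z = 1

expected_small = 2j * math.pi * residues[0]
expected_large = 2j * math.pi * (residues[0] + residues[1])

print(small, expected_small)
print(large, expected_large)
```

The larger contour gives zero because the two enclosed residues cancel.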

Using the transformation $z = 1/w$ for $w \to 0$, we can find the nature of
a singularity at $z \to \infty$ and the residue of a function $f(z)$ with just isolated
singularities and no branch points. In such cases, we know that

$$\sum \{\text{residues in the finite z-plane}\} + \{\text{residue at } z \to \infty\} = 0.$$


Evaluation of Definite Integrals 


The calculus of residues is useful in evaluating a wide variety of definite
integrals in both physical and purely mathematical problems. We consider, first,
integrals of the form

$$I = \int_0^{2\pi} f(\sin\theta, \cos\theta)\, d\theta, \tag{7.11}$$

where f is finite for all values of $\theta$. We also require f to be a rational function
of $\sin\theta$ and $\cos\theta$ so that it will be single-valued. Let

$$z = e^{i\theta}, \qquad dz = i e^{i\theta}\, d\theta.$$

From this,

$$d\theta = -i\,\frac{dz}{z}, \qquad \sin\theta = \frac{z - z^{-1}}{2i}, \qquad \cos\theta = \frac{z + z^{-1}}{2}. \tag{7.12}$$

Our integral becomes

$$I = -i \oint f\!\left( \frac{z - z^{-1}}{2i}, \frac{z + z^{-1}}{2} \right) \frac{dz}{z}, \tag{7.13}$$

with the path of integration the unit circle. By the residue theorem [Eq. (7.10)],

$$I = (-i)\, 2\pi i \sum \text{residues within the unit circle}. \tag{7.14}$$

Note that we want to determine the residues of $f(z)/z$. Illustrations of integrals
of this type are provided by Exercises 7.2.6-7.2.9.


EXAMPLE 7.2.1


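Reciprocal Cosine Our problem is to evaluate the definite integral

$$I = \int_0^{2\pi} \frac{d\theta}{1 + \varepsilon \cos\theta}, \qquad |\varepsilon| < 1.$$

By Eq. (7.13), this becomes

$$I = -i \oint_{\text{unit circle}} \frac{dz}{z[1 + (\varepsilon/2)(z + z^{-1})]}
= -i\,\frac{2}{\varepsilon} \oint \frac{dz}{z^2 + (2/\varepsilon) z + 1}.$$

The denominator has roots

$$z_- = -\frac{1}{\varepsilon} - \frac{1}{\varepsilon}\sqrt{1 - \varepsilon^2} \quad \text{and} \quad
z_+ = -\frac{1}{\varepsilon} + \frac{1}{\varepsilon}\sqrt{1 - \varepsilon^2}$$

and can be written as $(z - z_+)(z - z_-)$. Here, $z_+$ is within the unit circle; $z_-$ is
outside. Near $z_+$ the denominator can be expanded as

$$z^2 + \frac{2}{\varepsilon} z + 1 = 0 + (z - z_+) \left. \frac{d}{dz}\left[ z^2 + \frac{2}{\varepsilon} z + 1 \right] \right|_{z_+}
= (z - z_+)\left( 2z_+ + \frac{2}{\varepsilon} \right),$$

so that the residue at $z_+$ is

$$\frac{1}{2z_+ + 2/\varepsilon}.$$

(See Exercise 7.1.1.) Then by Eq. (7.14),

$$I = -i\,\frac{2}{\varepsilon} \cdot 2\pi i \left. \frac{1}{2z + 2/\varepsilon} \right|_{z = -1/\varepsilon + (1/\varepsilon)\sqrt{1 - \varepsilon^2}}.$$

Hence,

$$\int_0^{2\pi} \frac{d\theta}{1 + \varepsilon \cos\theta} = \frac{2\pi}{\sqrt{1 - \varepsilon^2}}, \qquad |\varepsilon| < 1. \qquad ■$$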

Now consider a class of definite integrals that have the form

$$\int_{-\infty}^{\infty} f(x)\, dx \tag{7.15}$$

and satisfy the two conditions:

• $f(z)$ is analytic in the upper half-plane except for a finite number of poles.
(It will be assumed that there are no poles on the real axis. If poles are
present on the real axis, they may be included or excluded as discussed
elsewhere.)

• $f(z)$ vanishes as strongly$^{3}$ as $1/z^2$ for $|z| \to \infty$, $0 \le \arg z \le \pi$.


3 We could use "$f(z)$ vanishes faster than $1/z$"; that is, the second condition is overly sufficient, and
we wish to have $f(z)$ single-valued.

Figure 7.4 Path of Integration is a Half Circle in the Upper Half Plane

With these conditions, we may take as a contour of integration the real axis
and a semicircle in the upper half-plane as shown in Fig. 7.4. We let the radius
R of the semicircle become infinitely large. Then

$$\oint f(z)\, dz = \lim_{R \to \infty} \int_{-R}^{R} f(x)\, dx + \lim_{R \to \infty} \int_0^{\pi} f(Re^{i\theta})\, iRe^{i\theta}\, d\theta
= 2\pi i \sum \text{residues (upper half-plane)}. \tag{7.16}$$

From the second condition, the second integral (over the semicircle) vanishes
and

$$\int_{-\infty}^{\infty} f(x)\, dx = 2\pi i \sum \text{residues (upper half-plane)}. \tag{7.17}$$

Note that a corresponding result is obtained when f is analytic in the lower
half-plane and we use a contour in the lower half-plane. In that case, the contour
will be tracked clockwise and the residues will enter with a minus sign.
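Equation (7.17) can be tried on an integrand that is not treated in the text. The sketch below assumes $f(z) = 1/(1 + z^4)$, which satisfies both conditions: it has two simple poles in the upper half-plane, at $e^{i\pi/4}$ and $e^{3i\pi/4}$, and falls off like $1/z^4$. The residue at a simple zero $z_0$ of the denominator Q is taken as $1/Q'(z_0)$, and the result is compared with a quadrature over the real axis using the substitution $x = \tan t$. All names are hypothetical.

```python
import cmath
import math


def real_line_integral(f, steps=20000):
    """Integral of f over the whole real axis via x = tan(t),
    -pi/2 < t < pi/2 (midpoint rule, so the endpoints are never used)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -math.pi / 2 + (k + 0.5) * h
        x = math.tan(t)
        total += f(x) * (1 + x * x) * h      # dx = (1 + x^2) dt
    return total


def f(z):
    """Sample integrand (an assumption for illustration)."""
    return 1 / (1 + z ** 4)


# Simple poles in the upper half-plane: roots of z^4 = -1 with Im z > 0.
poles = [cmath.exp(1j * math.pi / 4), cmath.exp(3j * math.pi / 4)]
# Residue of 1/Q at a simple zero z0 of Q is 1/Q'(z0) = 1/(4 z0^3).
residues = [1 / (4 * z0 ** 3) for z0 in poles]

by_residues = 2j * math.pi * sum(residues)
by_quadrature = real_line_integral(f)

print(by_residues, by_quadrature, math.pi / math.sqrt(2))
```

Both routes give $\pi/\sqrt{2}$, and the residue sum comes out real, as it must for a real integrand.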


EXAMPLE 7.2.2


Inverse Polynomial Evaluate

$$I = \int_{-\infty}^{\infty} \frac{dx}{1 + x^2}.$$

From Eq. (7.16),

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = 2\pi i \sum \text{residues (upper half-plane)}. \tag{7.18}$$

Here and in every other similar problem, we have the question: Where are the
poles? Rewriting the integrand as

$$\frac{1}{z^2 + 1} = \frac{1}{z + i} \cdot \frac{1}{z - i}, \tag{7.19}$$

we see that there are simple poles (order 1) at $z = i$ and $z = -i$.

A simple pole at $z = z_0$ indicates (and is indicated by) a Laurent expansion
of the form

$$f(z) = \frac{a_{-1}}{z - z_0} + a_0 + \sum_{n=1}^{\infty} a_n (z - z_0)^n. \tag{7.20}$$

The residue $a_{-1}$ is easily isolated as (Exercise 7.1.1)

$$a_{-1} = (z - z_0) f(z) \big|_{z = z_0}. \tag{7.21}$$

Using Eq. (7.21), we find that the residue at $z = i$ is $1/2i$, whereas that at $z = -i$
is $-1/2i$. Another way to see this is to write the partial fraction decomposition:

$$\frac{1}{z^2 + 1} = \frac{1}{2i} \left( \frac{1}{z - i} - \frac{1}{z + i} \right).$$

Then

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = 2\pi i \cdot \frac{1}{2i} = \pi. \tag{7.22}$$

Here, we used $a_{-1} = 1/2i$ for the residue of the one included pole at $z = i$.
Readers should satisfy themselves that it is possible to use the lower semicircle
and that this choice will lead to the same result: $I = \pi$. ■


A more delicate problem is provided by the next example.


EXAMPLE 7.2.3


Evaluation of Definite Integrals Consider definite integrals of the
form

$$I = \int_{-\infty}^{\infty} f(x) e^{iax}\, dx, \tag{7.23}$$

with a real and positive. This is a Fourier transform (Chapter 15). We assume
the two conditions:

• $f(z)$ is analytic in the upper half-plane except for a finite number of poles.

• $\lim_{|z| \to \infty} f(z) = 0$, $0 \le \arg z \le \pi$. (7.24)

Note that this is a less restrictive condition than the second condition imposed
on $f(z)$ for integrating $\int_{-\infty}^{\infty} f(x)\, dx$.

We employ the contour shown in Fig. 7.4 because the exponential factor
goes rapidly to zero in the upper half-plane. The application of the calculus
of residues is the same as the one just considered, but here we have to work
harder to show that the integral over the (infinite) semicircle goes to zero. This
integral becomes


$$I_R = \int_0^{\pi} f(Re^{i\theta})\, e^{iaR\cos\theta - aR\sin\theta}\, iRe^{i\theta}\, d\theta. \tag{7.25}$$

Figure 7.5 (a) $y = (2/\pi)\theta$; (b) $y = \sin\theta$

Let R be so large that $|f(z)| = |f(Re^{i\theta})| < \varepsilon$. Then

$$|I_R| \le \varepsilon R \int_0^{\pi} e^{-aR\sin\theta}\, d\theta = 2\varepsilon R \int_0^{\pi/2} e^{-aR\sin\theta}\, d\theta. \tag{7.26}$$

In the range $[0, \pi/2]$,

$$\frac{2}{\pi}\theta \le \sin\theta.$$

Therefore (Fig. 7.5),

$$|I_R| \le 2\varepsilon R \int_0^{\pi/2} e^{-aR\,2\theta/\pi}\, d\theta. \tag{7.27}$$

Now, integrating by inspection, we obtain

$$|I_R| \le 2\varepsilon R\, \frac{1 - e^{-aR}}{aR\,2/\pi}.$$

Finally,

$$\lim_{R \to \infty} |I_R| \le \frac{\pi}{a}\,\varepsilon. \tag{7.28}$$

From condition (7.24), $\varepsilon \to 0$ as $R \to \infty$ and

$$\lim_{R \to \infty} |I_R| = 0. \tag{7.29}$$

This useful result is sometimes called Jordan's lemma. With it, we are
prepared to deal with Fourier integrals of the form shown in Eq. (7.23).
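The key estimate behind Jordan's lemma is that $R\int_0^{\pi} e^{-aR\sin\theta}\, d\theta$ stays below $\pi/a$ however large R becomes, so the factor R from $dz$ does no harm. The sketch below evaluates this quantity numerically for a few radii; the function name, the value $a = 1$, and the radii are assumptions made for illustration.

```python
import math


def semicircle_factor(a, R, steps=20000):
    """Midpoint estimate of R * integral from 0 to pi of
    exp(-a*R*sin(theta)) d(theta)."""
    h = math.pi / steps
    return R * sum(math.exp(-a * R * math.sin((k + 0.5) * h)) * h
                   for k in range(steps))


a = 1.0
bound = math.pi / a      # upper bound following from Eqs. (7.26)-(7.28)
factors = {R: semicircle_factor(a, R) for R in (1.0, 10.0, 100.0, 1000.0)}

for R, value in factors.items():
    print(R, value, bound)
```

Since the factor stays bounded, $|I_R|$ is controlled entirely by $\varepsilon$, the maximum of $|f|$ on the semicircle, which tends to zero by condition (7.24).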

Using the contour shown in Fig. 7.4, we have

$$\int_{-\infty}^{\infty} f(x) e^{iax}\, dx + \lim_{R \to \infty} I_R = 2\pi i \sum \text{residues (upper half-plane)}.$$

Since the integral over the upper semicircle $I_R$ vanishes as $R \to \infty$ (Jordan's
lemma),

$$\int_{-\infty}^{\infty} f(x) e^{iax}\, dx = 2\pi i \sum \text{residues (upper half-plane)} \quad (a > 0). \tag{7.30}$$

This result actually holds more generally for complex a with $\Re(a) > 0$. ■

Figure 7.6 Bypassing Singular Points
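Equation (7.30) can be illustrated with the sample function $f(z) = 1/(1 + z^2)$, an assumption chosen here because it vanishes at infinity and has a single simple pole, $z = i$, in the upper half-plane. The residue of $e^{iaz}/(1 + z^2)$ at $z = i$ is $e^{-a}/2i$, so the integral should equal $\pi e^{-a}$. The sketch compares this with a midpoint-rule estimate over a long but finite interval; the cutoff L and the step count are illustrative choices.

```python
import cmath
import math


def fourier_integral(a, L=200.0, steps=200000):
    """Midpoint estimate of the integral of exp(i*a*x)/(1 + x^2) over [-L, L].
    The sine part is odd and cancels, so only cos(a*x)/(1 + x^2) is summed."""
    h = 2 * L / steps
    total = 0.0
    for k in range(steps):
        x = -L + (k + 0.5) * h
        total += math.cos(a * x) / (1 + x * x) * h
    return total


def by_residue(a):
    """2*pi*i times the residue of exp(i*a*z)/(1 + z^2) at z = i, for a > 0."""
    residue = cmath.exp(1j * a * 1j) / 2j
    return 2j * math.pi * residue


comparisons = [(a, fourier_integral(a), by_residue(a).real) for a in (1.0, 2.0)]

for a, numeric, exact in comparisons:
    print(a, numeric, exact)
```

The small remaining difference comes from truncating the real axis at $\pm L$.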


Cauchy Principal Value 


Occasionally, an isolated first-order pole will be directly on the contour of 
integration. In this case, we may deform the contour to include or exclude the 
residue as desired by including a semicircular detour of infinitesimal radius. 
This is shown in Fig. 7.6. The integration over the semicircle then gives, with 

z — xo — Se 1 ^, dz = iSe l<p d(p, 


r dz _ . r 

J Z-Xo Jjt 

[ = i [ dip = 

J z-x 0 


dcp — in, i.e., nia-i 


if counterclockwise, 


= —in, i.e., — nid-\ if clockwise. 

J 71 

This contribution, + or —, appears on the left-hand side of Eq. (7.10). If our 
detour were clockwise, the residue would not be enclosed and there would 
be no corresponding term on the right-hand side of Eq. (7.10). However, if our 
detour were counterclockwise, this residue would be enclosed by the contour 
C and a term $2\pi i a_{-1}$ would appear on the right-hand side of Eq. (7.10). The
net result for either a clockwise or counterclockwise detour is that a simple 
pole on the contour is counted as one-half what it would be if it were within 
the contour. 

For instance, let us suppose that $f(z)$ with a simple pole at $z = x_0$ is integrated over the entire real axis, assuming $|f(z)| \to 0$ for $|z| \to \infty$ fast enough (faster than $1/|z|$) that the integrals in question are finite. The contour is closed with an infinite semicircle in the upper half-plane (Fig. 7.7). Then

\[
\oint f(z)\, dz = \int_{-\infty}^{x_0-\delta} f(x)\, dx + \int_{C_{x_0}} f(z)\, dz + \int_{x_0+\delta}^{\infty} f(x)\, dx + \int_{C\ \text{infinite semicircle}} = 2\pi i \sum \text{enclosed residues}. \tag{7.31}
\]
Figure 7.7
Closing the Contour with an Infinite Radius Semicircle

If the small semicircle $C_{x_0}$ includes $x_0$ (by going below the $x$-axis; counterclockwise), $x_0$ is enclosed, and its contribution appears twice: as $\pi i a_{-1}$ in $\int_{C_{x_0}}$ and as $2\pi i a_{-1}$ in the term $2\pi i \sum$ enclosed residues, for a net contribution of $\pi i a_{-1}$ on the right-hand side of Eq. (7.31). If the upper small semicircle is selected, $x_0$ is excluded. The only contribution is from the clockwise integration over $C_{x_0}$, which yields $-\pi i a_{-1}$. Moving this to the extreme right of Eq. (7.31), we have $+\pi i a_{-1}$, as before.

The integrals along the x-axis may be combined and the semicircle radius 
permitted to approach zero. We therefore define 


\[
\lim_{\delta\to 0}\left\{ \int_{-\infty}^{x_0-\delta} f(x)\, dx + \int_{x_0+\delta}^{\infty} f(x)\, dx \right\} = P\int_{-\infty}^{\infty} f(x)\, dx. \tag{7.32}
\]


$P$ indicates the Cauchy principal value and represents the preceding limiting process. Note that the Cauchy principal value is a balancing or canceling process; for even-order poles, $P\int_{-\infty}^{\infty} f(x)\, dx$ is not finite because there is no cancellation. In the vicinity of our singularity at $z = x_0$,

\[
f(x) \approx \frac{a_{-1}}{x - x_0}. \tag{7.33}
\]

This is odd, relative to $x_0$. The symmetric or even interval (relative to $x_0$) provides cancellation of the shaded areas (Fig. 7.8). The contribution of the singularity is in the integration about the semicircle.
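
The balancing act in Eq. (7.32) can be illustrated with a short numerical sketch. The helper names, the interval $[-1, 2]$, and the test functions $1/x$ and $1/x^2$ are illustrative assumptions: for the simple pole the excised integral settles at $\ln 2$ for every $\delta$, whereas for the second-order pole it grows like $2/\delta$.

```python
import math

def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def excised_integral(f, lo, hi, x0, delta):
    """Integral over [lo, x0 - delta] plus [x0 + delta, hi], as in Eq. (7.32)."""
    return simpson(f, lo, x0 - delta) + simpson(f, x0 + delta, hi)

for delta in (0.1, 0.01):
    simple = excised_integral(lambda x: 1.0 / x, -1.0, 2.0, 0.0, delta)
    double = excised_integral(lambda x: 1.0 / x**2, -1.0, 2.0, 0.0, delta)
    print(delta, simple, double)  # simple stays near ln 2; double grows like 2/delta
```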

Sometimes, this same limiting technique is applied to the integration limits $\pm\infty$. If there is no singularity, we may define

\[
P\int_{-\infty}^{\infty} f(x)\, dx = \lim_{a\to\infty} \int_{-a}^{a} f(x)\, dx. \tag{7.34}
\]


An alternate treatment moves the pole off the contour and then considers the limiting behavior as it is brought back. The singular points are moved off the contour in such a way that the integral is forced into the form desired to satisfy the boundary conditions of the physical problem (for Green’s functions this is often the case; see Examples 7.2.5 and 16.3.2). The principal value limit is not necessary when a pole is removed by a zero of a numerator function. The integral

\[
\int_{-\infty}^{\infty} \frac{\sin z}{z}\, dz = 2\int_0^{\infty} \frac{\sin z}{z}\, dz = \pi,
\]

evaluated next, is a case in point.

Figure 7.8
Contour

EXAMPLE 7.2.4

Singularity on Contour of Integration The problem is to evaluate

\[
I = \int_0^{\infty} \frac{\sin x}{x}\, dx. \tag{7.35}
\]

This may be taken as half the imaginary part$^4$ of

\[
I_z = P\int_{-\infty}^{\infty} \frac{e^{iz}\, dz}{z}. \tag{7.36}
\]

Now the only pole is a simple pole at $z = 0$ and the residue there by Eq. (7.21) is $a_{-1} = 1$. We choose the contour shown in Fig. 7.9 (i) to avoid the pole, (ii) to include the real axis, and (iii) to yield a vanishingly small integrand for $z = iy$, $y \to \infty$. Note that in this case a large (infinite) semicircle in the lower half-plane would be disastrous. We have

\[
\oint \frac{e^{iz}\, dz}{z} = \int_{-R}^{-r} \frac{e^{ix}\, dx}{x} + \int_{C_1} \frac{e^{iz}\, dz}{z} + \int_{r}^{R} \frac{e^{ix}\, dx}{x} + \int_{C_2} \frac{e^{iz}\, dz}{z} = 0, \tag{7.37}
\]

the final zero coming from the residue theorem [Eq. (7.10)]. By Jordan’s lemma,

\[
\int_{C_2} \frac{e^{iz}\, dz}{z} = 0, \tag{7.38}
\]

and

\[
\oint \frac{e^{iz}\, dz}{z} = P\int_{-\infty}^{\infty} \frac{e^{ix}\, dx}{x} + \int_{C_1} \frac{e^{iz}\, dz}{z} = 0. \tag{7.39}
\]

The integral over the small semicircle yields $(-)\pi i$ times the residue of 1, the minus as a result of going clockwise. Taking the imaginary part,$^5$ we have

\[
\int_{-\infty}^{\infty} \frac{\sin x}{x}\, dx = \pi \tag{7.40}
\]

or

\[
\int_0^{\infty} \frac{\sin x}{x}\, dx = \frac{\pi}{2}. \tag{7.41}
\]

4 One can use $\int [(e^{iz} - e^{-iz})/2iz]\, dz$, but then two different contours will be needed for the two exponentials.

Figure 7.9
Singularity on Contour

The contour of Fig. 7.9, although convenient, is not at all unique. Another 
choice of contour for evaluating Eq. (7.35) is presented as Exercise 7.2.14. ■ 
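
As a numerical cross-check of Eq. (7.41), the sketch below integrates $\sin x / x$ from 0 to a finite cutoff $X$ with Simpson's rule. The function names and the choice of cutoffs $X = (N + \tfrac12)\pi$, where the leading truncation term $\cos X / X$ vanishes, are assumptions made for this illustration only.

```python
import math

def sinc(x):
    """sin(x)/x with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def sine_integral(upper, n):
    """Composite Simpson estimate of the integral of sin(x)/x over [0, upper]; n even."""
    h = upper / n
    total = sinc(0.0) + sinc(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * sinc(k * h)
    return total * h / 3.0

for N in (10, 100, 400):
    X = (N + 0.5) * math.pi      # cutoff at a zero of cos(X)
    n = 2 * int(50 * X)          # roughly 100 Simpson points per unit length
    print(N, sine_integral(X, n), math.pi / 2)
```

The truncated values approach $\pi/2$ with an error of order $1/X^2$, consistent with the contour result.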


EXAMPLE 7.2.5 


Quantum Mechanical Scattering The quantum mechanical analysis of scattering leads to the function

\[
I(\sigma) = \int_{-\infty}^{\infty} \frac{x \sin x\, dx}{x^2 - \sigma^2}, \tag{7.42}
\]

where $\sigma$ is real and positive. From the physical conditions of the problem there is a further requirement: $I(\sigma)$ must have the form $e^{i\sigma}$ so that it will represent an outgoing scattered wave.


5 Alternatively, we may combine the integrals of Eq. (7.37) as

\[
\int_{-R}^{-r} e^{ix}\frac{dx}{x} + \int_{r}^{R} e^{ix}\frac{dx}{x} = \int_{r}^{R} \left(e^{ix} - e^{-ix}\right)\frac{dx}{x} = 2i\int_{r}^{R} \frac{\sin x}{x}\, dx.
\]

Figure 7.10
Contours

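Using

\[
\sin z = \frac{1}{2i}e^{iz} - \frac{1}{2i}e^{-iz}, \tag{7.43}
\]

we write Eq. (7.42) in the complex plane as

\[
I(\sigma) = I_1 + I_2, \tag{7.44}
\]

with

\[
I_1 = \frac{1}{2i}\int_{-\infty}^{\infty} \frac{z e^{iz}}{z^2 - \sigma^2}\, dz, \qquad
I_2 = -\frac{1}{2i}\int_{-\infty}^{\infty} \frac{z e^{-iz}}{z^2 - \sigma^2}\, dz. \tag{7.45}
\]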


Integral $I_1$ is similar to Example 7.2.2 and, as in that case, we may complete the contour by an infinite semicircle in the upper half-plane as shown in Fig. 7.10a. For $I_2$, the exponential is negative and we complete the contour by an infinite semicircle in the lower half-plane, as shown in Fig. 7.10b. As in Example 7.2.2, neither semicircle contributes anything to the integral (Jordan’s lemma).

There is still the problem of locating the poles and evaluating the residues. We find poles at $z = +\sigma$ and $z = -\sigma$ on the contour of integration. The residues are (Exercises 7.1.1 and 7.2.1)

\[
\begin{array}{c|cc}
 & z = \sigma & z = -\sigma \\ \hline
I_1 & \dfrac{e^{i\sigma}}{2} & \dfrac{e^{-i\sigma}}{2} \\[2mm]
I_2 & \dfrac{e^{-i\sigma}}{2} & \dfrac{e^{i\sigma}}{2}
\end{array}
\]
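Detouring around the poles, as shown in Fig. 7.10 (it matters little whether we go above or below), we find that the residue theorem leads to

\[
P I_1 - \pi i \left(\frac{1}{2i}\right)\frac{e^{-i\sigma}}{2} + \pi i \left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2} = 2\pi i \left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2} \tag{7.46}
\]

because we have enclosed the singularity at $z = \sigma$ but excluded the one at $z = -\sigma$. In similar fashion, but noting that the contour for $I_2$ is clockwise,

\[
P I_2 - \pi i \left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2} + \pi i \left(\frac{-1}{2i}\right)\frac{e^{-i\sigma}}{2} = -2\pi i \left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2}. \tag{7.47}
\]

Adding Eqs. (7.46) and (7.47), we have

\[
P I(\sigma) = P I_1 + P I_2 = \frac{\pi}{2}\left(e^{i\sigma} + e^{-i\sigma}\right) = \pi\cos\sigma. \tag{7.48}
\]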


This is a perfectly good evaluation of Eq. (7.42), but unfortunately the cosine dependence is appropriate for a standing wave and not for the outgoing scattered wave as specified.

To obtain the desired form, we try a different technique. We note that the integral, Eq. (7.42), is not absolutely convergent and its value will depend on the method of evaluation. Instead of dodging around the singular points, let us move them off the real axis. Specifically, let $\sigma \to \sigma + i\gamma$, $-\sigma \to -\sigma - i\gamma$, where $\gamma$ is positive but small and will eventually be made to approach zero; that is, for $I_1$ we include one pole and for $I_2$ the other one:

\[
I_+(\sigma) = \lim_{\gamma\to 0} I(\sigma + i\gamma). \tag{7.49}
\]

With this simple substitution, the first integral $I_1$ becomes

\[
I_1(\sigma + i\gamma) = 2\pi i \left(\frac{1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2} \tag{7.50}
\]

by direct application of the residue theorem. Also,

\[
I_2(\sigma + i\gamma) = -2\pi i \left(\frac{-1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2}. \tag{7.51}
\]

Adding Eqs. (7.50) and (7.51) and then letting $\gamma \to 0$, we obtain

\[
I_+(\sigma) = \lim_{\gamma\to 0}\left[I_1(\sigma + i\gamma) + I_2(\sigma + i\gamma)\right] = \lim_{\gamma\to 0} \pi e^{i(\sigma + i\gamma)} = \pi e^{i\sigma}, \tag{7.52}
\]

a result that does fit the boundary conditions of our scattering problem.

It is interesting to note that the substitution $\sigma \to \sigma - i\gamma$ would have led to

\[
I_-(\sigma) = \pi e^{-i\sigma}, \tag{7.53}
\]

which could represent an incoming wave. Our earlier result [Eq. (7.48)] is seen to be the arithmetic average of Eqs. (7.52) and (7.53). This average is the Cauchy principal value of the integral. Note that we have these possibilities [Eqs. (7.48), (7.52), and (7.53)] because our integral is not uniquely defined until we specify the particular limiting process (or average) to be used. ■
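
The displaced-pole results can be checked by brute force for a finite displacement $\gamma$. In this sketch (the function name, the cutoff rule, and the sample values $\sigma = 1$, $\gamma = 0.3$ are illustrative assumptions) the integral of Eq. (7.42) with $\sigma \to \sigma \pm i\gamma$ is evaluated by Simpson's rule and compared with $\pi e^{i(\sigma + i\gamma)}$ and $\pi e^{-i(\sigma - i\gamma)}$, the finite-$\gamma$ forms behind Eqs. (7.52) and (7.53).

```python
import cmath
import math

def scattering_integral(s, half_periods=200, points_per_unit=100):
    """Simpson estimate of the integral of x sin(x) / (x^2 - s^2) over the real axis
    for complex s off the real axis. Uses the evenness of the integrand and cuts
    the range at X = (N + 1/2) pi, where the leading tail term cos(X)/X vanishes."""
    X = (half_periods + 0.5) * math.pi
    n = 2 * int(points_per_unit * X / 2)
    h = X / n
    f = lambda x: x * math.sin(x) / (x * x - s * s)
    total = f(0.0) + f(X)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 2.0 * total * h / 3.0

sigma, gamma = 1.0, 0.3  # illustrative values
s_plus = complex(sigma, gamma)
s_minus = complex(sigma, -gamma)
print(scattering_integral(s_plus), math.pi * cmath.exp(1j * s_plus))
print(scattering_integral(s_minus), math.pi * cmath.exp(-1j * s_minus))
```

Both comparisons agree to within the truncation error, and as $\gamma \to 0$ the two values tend to $\pi e^{\pm i\sigma}$, whose average is the principal value $\pi\cos\sigma$.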


Pole Expansion of Meromorphic Functions 


Analytic functions $f(z)$ that have only separated poles as singularities are called meromorphic. Examples are ratios of polynomials, $\cot z$ (see Example 7.2.7), and the logarithmic derivative $f'(z)/f(z)$ of entire functions. For simplicity, we assume that these poles at finite $z = z_n$ with $0 < |z_1| < |z_2| < \cdots$ are all simple, with residues $b_n$. Then an expansion of $f(z)$ in terms of $b_n (z - z_n)^{-1}$ depends only on intrinsic properties of $f(z)$, in contrast to the Taylor expansion about an arbitrary analytic point of $f(z)$ or the Laurent expansion about some singular point of $f(z)$.


EXAMPLE 7.2.6 


Rational Functions Rational functions are ratios of polynomials that can be completely factored according to the fundamental theorem of algebra. A partial fraction expansion then generates the pole expansion. Let us consider a few simple examples. We check that the meromorphic function

\[
f(z) = \frac{1}{z(z+1)} = \frac{1}{z} - \frac{1}{z+1}
\]

has simple poles at $z = 0$, $z = -1$ with residues $\left.\frac{z}{z(z+1)}\right|_{z=0} = \left.\frac{1}{z+1}\right|_{z=0} = 1$ and $\left.\frac{z+1}{z(z+1)}\right|_{z=-1} = \left.\frac{1}{z}\right|_{z=-1} = -1$, respectively. At $\infty$, $f(z)$ has a second-order zero. Similarly, the meromorphic function

\[
f(z) = \frac{1}{z^2 - 1} = \frac{1}{2}\left(\frac{1}{z-1} - \frac{1}{z+1}\right)
\]

has simple poles at $z = \pm 1$ with residues $\left.\frac{z-1}{z^2-1}\right|_{z=1} = \left.\frac{1}{z+1}\right|_{z=1} = \frac{1}{2}$ and $\left.\frac{z+1}{z^2-1}\right|_{z=-1} = \left.\frac{1}{z-1}\right|_{z=-1} = -\frac{1}{2}$. At infinity, $f(z)$ has a second-order zero also.

Let us consider a series of concentric circles $C_n$ about the origin so that $C_n$ includes $z_1, z_2, \ldots, z_n$ but no other poles, its radius $R_n \to \infty$ as $n \to \infty$. To guarantee convergence we assume that $|f(z)| < \varepsilon R_n$ for an arbitrarily small positive constant $\varepsilon$ and all $z$ on $C_n$, and $f$ is regular at the origin. Then the series

\[
f(z) = f(0) + \sum_{n=1}^{\infty} b_n\left[(z - z_n)^{-1} + z_n^{-1}\right] \tag{7.54}
\]

converges to $f(z)$.

To prove this theorem (due to Mittag-Leffler) we use the residue theorem to evaluate the following contour integral for $z$ inside $C_n$ and not equal to a singular point of $f(w)/w$:

\[
I_n = \frac{1}{2\pi i}\int_{C_n} \frac{f(w)}{w(w - z)}\, dw = \sum_{m=1}^{n} \frac{b_m}{z_m(z_m - z)} + \frac{f(z) - f(0)}{z}, \tag{7.55}
\]

where $w$ in the denominator is needed for convergence and $w - z$ to produce $f(z)$ via the residue theorem. On $C_n$ we have for $n \to \infty$,

\[
|I_n| \le 2\pi R_n \frac{\max_{w\ \text{on}\ C_n} |f(w)|}{2\pi R_n (R_n - |z|)} < \frac{\varepsilon R_n}{R_n - |z|} \lesssim \varepsilon
\]

for $R_n \gg |z|$ so that $|I_n| \to 0$ as $n \to \infty$. Using $I_n \to 0$ in Eq. (7.55) proves Eq. (7.54).

If $|f(z)| < \varepsilon R_n^{p+1}$ for some positive integer $p$, then we evaluate similarly the integral

\[
I_n = \frac{1}{2\pi i}\int_{C_n} \frac{f(w)}{w^{p+1}(w - z)}\, dw \to 0 \quad \text{as } n \to \infty
\]

and obtain the analogous pole expansion

\[
f(z) = f(0) + z f'(0) + \cdots + \frac{z^p f^{(p)}(0)}{p!} + \sum_{n=1}^{\infty} \frac{b_n z^{p+1}/z_n^{p+1}}{z - z_n}. \tag{7.56}
\]

Note that the convergence of the series in Eqs. (7.54) and (7.56) is implied by the bound of $|f(z)|$ for $|z| \to \infty$.

EXAMPLE 7.2.7

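Pole Expansion of Cotangent The meromorphic function $f(z) = \pi\cot\pi z$ has simple poles at $z = \pm n$, for $n = 0, 1, 2, \ldots$ with residues $\left.\frac{\pi\cos\pi z}{d\sin\pi z/dz}\right|_{z=n} = \frac{\pi\cos\pi n}{\pi\cos\pi n} = 1$. Before we apply the Mittag-Leffler theorem, we have to subtract the pole at $z = 0$. Then the pole expansion becomes

\[
\pi\cot\pi z - \frac{1}{z} = \sum_{n=1}^{\infty}\left(\frac{1}{z-n} + \frac{1}{n} + \frac{1}{z+n} - \frac{1}{n}\right)
= \sum_{n=1}^{\infty}\left(\frac{1}{z-n} + \frac{1}{z+n}\right)
= \sum_{n=1}^{\infty}\frac{2z}{z^2 - n^2}.
\]

We check this result by taking the logarithm of the product for the sine [Eq. (7.60)] and differentiating

\[
\pi\cot\pi z = \frac{1}{z} + \sum_{n=1}^{\infty}\left(\frac{-1}{n\left(1 - \frac{z}{n}\right)} + \frac{1}{n\left(1 + \frac{z}{n}\right)}\right)
= \frac{1}{z} + \sum_{n=1}^{\infty}\left(\frac{1}{z-n} + \frac{1}{z+n}\right).
\]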
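
A quick numerical comparison makes the convergence of this pole expansion concrete. In the sketch below, the function names and the sample point $z = 0.3 + 0.2i$ are illustrative assumptions; the truncated sum $1/z + \sum_{n=1}^{N} 2z/(z^2 - n^2)$ is compared with $\pi\cot\pi z$.

```python
import cmath
import math

def pi_cot_pi_z(z):
    """Reference value pi * cot(pi z) from the library cosine and sine."""
    return math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)

def pole_expansion(z, terms):
    """Truncated pole expansion 1/z + sum_{n=1}^{terms} 2 z / (z^2 - n^2)."""
    total = 1.0 / z
    for n in range(1, terms + 1):
        total += 2.0 * z / (z * z - n * n)
    return total

z = 0.3 + 0.2j  # illustrative point away from the poles at the integers
for terms in (10, 1000, 100000):
    print(terms, abs(pole_expansion(z, terms) - pi_cot_pi_z(z)))
```

The discrepancy shrinks roughly like $2|z|/N$, reflecting the $1/n^2$ decay of the combined terms.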


Finally, let us also compare the pole expansion of the rational functions of Example 7.2.6 with the earlier partial fraction forms. From the Mittag-Leffler theorem, we get

\[
\frac{1}{z^2 - 1} = -1 + \frac{1}{2}\left(\frac{1}{z-1} + 1\right) - \frac{1}{2}\left(\frac{1}{z+1} - 1\right),
\]

which is in agreement with our earlier result. In order to apply the Mittag-Leffler theorem to $\frac{1}{z(z+1)}$, we first must subtract the pole at $z = 0$. We obtain

\[
\frac{1}{z(z+1)} - \frac{1}{z} = \frac{-1}{z+1} = -1 - \left(\frac{1}{z+1} - 1\right),
\]

which again is consistent with Example 7.2.6. ■


Product Expansion of Entire Functions


A function $f(z)$ that is analytic for all finite $z$ is called an entire function. The logarithmic derivative $f'/f$ is a meromorphic function with a pole expansion, which can be used to get a product expansion of $f(z)$.

If $f(z)$ has a simple zero at $z = z_n$, then $f(z) = (z - z_n)g(z)$ with analytic $g(z)$ and $g(z_n) \ne 0$. Hence, the logarithmic derivative

\[
\frac{f'(z)}{f(z)} = (z - z_n)^{-1} + \frac{g'(z)}{g(z)} \tag{7.57}
\]

has a simple pole at $z = z_n$ with residue 1, and $g'/g$ is analytic there. If $f'/f$ satisfies the conditions that led to the pole expansion in Eq. (7.54), then

\[
\frac{f'(z)}{f(z)} = \frac{f'(0)}{f(0)} + \sum_{n=1}^{\infty}\left[\frac{1}{z_n} + \frac{1}{z - z_n}\right] \tag{7.58}
\]

holds. Integrating Eq. (7.58) yields

\[
\int_0^z \frac{f'(z)}{f(z)}\, dz = \ln f(z) - \ln f(0)
= \frac{z f'(0)}{f(0)} + \sum_{n=1}^{\infty}\left[\ln(z - z_n) - \ln(-z_n) + \frac{z}{z_n}\right],
\]

and exponentiating we obtain the product expansion

\[
f(z) = f(0)\exp\left(\frac{z f'(0)}{f(0)}\right)\prod_{n=1}^{\infty}\left(1 - \frac{z}{z_n}\right)e^{z/z_n}. \tag{7.59}
\]

Examples are the product expansions for

\[
\sin z = z\prod_{\substack{n=-\infty \\ n\ne 0}}^{\infty}\left(1 - \frac{z}{n\pi}\right)e^{z/n\pi} = z\prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2\pi^2}\right),
\qquad
\cos z = \prod_{n=1}^{\infty}\left[1 - \frac{z^2}{(n - 1/2)^2\pi^2}\right]. \tag{7.60}
\]

Note that the sine-product expansion is derived by applying Eq. (7.59) to $f(z) = \sin z/z$ rather than to $\sin z$, so that $f(0) = 1$ and

\[
f'(0) = \left.\left(\frac{\cos z}{z} - \frac{\sin z}{z^2}\right)\right|_{z=0}
= \left.\left(\frac{1}{z} - \frac{z}{2} - \frac{1}{z} + \frac{z}{6} + \cdots\right)\right|_{z=0} = 0,
\]

inserting the power expansions for $\sin z$ and $\cos z$. Another example is the product expansion of the gamma function, which will be discussed in Chapter 10.
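
The product expansions in Eq. (7.60) converge slowly but visibly. A minimal sketch, with illustrative function names and the arbitrary sample point $z = 1 + 0.5i$, compares truncated products with the library sine and cosine.

```python
import cmath
import math

def sine_product(z, terms):
    """Truncated product z * prod_{n=1}^{terms} (1 - z^2 / (n pi)^2)."""
    prod = z
    for n in range(1, terms + 1):
        prod *= 1.0 - z * z / (n * math.pi) ** 2
    return prod

def cosine_product(z, terms):
    """Truncated product prod_{n=1}^{terms} (1 - z^2 / ((n - 1/2) pi)^2)."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1.0 - z * z / ((n - 0.5) * math.pi) ** 2
    return prod

z = 1.0 + 0.5j  # illustrative sample point
for terms in (10, 1000, 20000):
    print(terms,
          abs(sine_product(z, terms) - cmath.sin(z)),
          abs(cosine_product(z, terms) - cmath.cos(z)))
```

The relative error of a product truncated after $N$ factors is of order $|z|^2/(\pi^2 N)$, the tail of $\sum 1/n^2$.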

As a consequence of Eq. (7.57), the contour integral of the logarithmic derivative may be used to count the number $N_f$ of zeros (including their multiplicities) of the function $f(z)$ inside the contour $C$:

\[
\frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\, dz = N_f. \tag{7.61}
\]

This follows from the leading term of the Taylor expansion of $f$ at a zero $z_0$, $f(z) = (z - z_0)f'(z_0)$, with $f'(z_0) \ne 0$, so that

\[
\frac{f'(z)}{f(z)} = \frac{1}{z - z_0} \quad\text{with}\quad \frac{1}{2\pi i}\oint \frac{dz}{z - z_0} = 1,
\]

where we integrate about a small circle around $z_0$. For a zero of order $m$ (a positive integer), where $f^{(m)}(z_0) \ne 0$ is the lowest nonvanishing derivative, the leading term of the Taylor expansion becomes

\[
f(z) = \frac{(z - z_0)^m}{m!} f^{(m)}(z_0), \qquad \frac{f'(z)}{f(z)} = \frac{m}{z - z_0}.
\]

Thus, the logarithmic derivative counts the zero with its multiplicity $m$. Moreover, using

\[
\int \frac{f'(z)}{f(z)}\, dz = \ln f(z) = \ln|f(z)| + i\arg f(z), \tag{7.62}
\]

we see that the real part in Eq. (7.62) does not change as $z$ moves once around the contour, whereas the corresponding change in $\arg f$, called $\Delta_C \arg(f)$, must be

\[
\Delta_C \arg(f) = 2\pi N_f. \tag{7.63}
\]

This leads to Rouché’s theorem:

If $f(z)$ and $g(z)$ are analytic inside and on a closed contour $C$, and $|g(z)| < |f(z)|$ on $C$, then $f(z)$ and $f(z) + g(z)$ have the same number of zeros inside $C$.

To show this we use

\[
2\pi N_{f+g} = \Delta_C \arg(f + g) = \Delta_C \arg(f) + \Delta_C \arg\left(1 + \frac{g}{f}\right).
\]

Since $|g| < |f|$ on $C$, the point $w = 1 + g(z)/f(z)$ is always an interior point of the circle in the $w$-plane with center at 1 and radius 1. Hence, $\arg(1 + g/f)$ must return to its original value when $z$ moves around $C$ (it passes to the right of the origin); it cannot decrease or increase by a multiple of $2\pi$ so that $\Delta_C \arg(1 + g/f) = 0$.
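
Equation (7.61) lends itself to a direct numerical experiment. The sketch below is only an illustration: the function name, the trapezoidal discretization of the circle, and the polynomial $z^4 - 5z + 1$ are assumptions chosen for this example. Rouché's theorem predicts one zero inside $|z| = 1$ (compare with $-5z$, since $|z^4 + 1| \le 2 < 5$) and four zeros inside $|z| = 2$ (compare with $z^4$, since $|-5z + 1| \le 11 < 16$).

```python
import cmath
import math

def count_zeros(f, df, center, radius, n=4000):
    """Trapezoidal estimate of (1 / 2 pi i) * closed integral of f'(z)/f(z) dz
    taken counterclockwise around the circle |z - center| = radius."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        z = center + radius * w
        total += df(z) / f(z) * (1j * radius * w)  # dz = i r e^{i phi} d(phi)
    return total * (2.0 * math.pi / n) / (2j * math.pi)

p = lambda z: z**4 - 5.0 * z + 1.0    # illustrative polynomial
dp = lambda z: 4.0 * z**3 - 5.0

print(count_zeros(p, dp, 0.0, 1.0))   # close to 1
print(count_zeros(p, dp, 0.0, 2.0))   # close to 4
```

Because the integrand is smooth and periodic on the circle, the trapezoidal rule converges very quickly, so rounding the real part to the nearest integer gives the zero count.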
Rouché’s theorem may be used for an alternative proof of the fundamental theorem of algebra: A polynomial $\sum_{m=0}^{n} a_m z^m$ with $a_n \ne 0$ has $n$ zeros. We define $f(z) = a_n z^n$. Then $f$ has an $n$-fold zero at the origin and no other zeros. Let $g(z) = \sum_{m=0}^{n-1} a_m z^m$. We apply Rouché’s theorem to a circle $C$ with center at the origin and radius $R > 1$. On $C$, $|f(z)| = |a_n|R^n$ and

\[
|g(z)| \le |a_0| + |a_1|R + \cdots + |a_{n-1}|R^{n-1} \le \left(\sum_{m=0}^{n-1} |a_m|\right) R^{n-1}.
\]

Hence, $|g(z)| < |f(z)|$ for $z$ on $C$ provided $R > \left(\sum_{m=0}^{n-1} |a_m|\right)/|a_n|$. For all sufficiently large circles $C$, therefore, $f + g = \sum_{m=0}^{n} a_m z^m$ has $n$ zeros inside $C$ according to Rouché’s theorem.


SUMMARY


The residue theorem

\[
\oint_C f(z)\, dz = 2\pi i \sum_{z_j \in C} [a_{-1}]_{z_j} = 2\pi i \sum (\text{residues enclosed by } C)
\]

and its applications to definite integrals are of central importance for mathematicians and physicists. When it is applied to meromorphic functions it yields an intrinsic pole expansion that depends on the first-order pole locations and their residues, provided the functions behave reasonably at $|z| \to \infty$. When it is applied to the logarithmic derivative of an entire function, it leads to its product expansion.

The residue theorem is the workhorse for solving definite integrals, at least 
for physicists and engineers. However, the mathematician also employs it to 
derive pole expansions for meromorphic functions and product expansions 
for entire functions. It forms part of the tool kit of every scientist. 


EXERCISES 


7.2.1 Determine the nature of the singularities of each of the following functions and evaluate the residues ($a > 0$):

(a) $\dfrac{1}{z^2 + a^2}$.  (b) $\dfrac{1}{(z^2 + a^2)^2}$.

(c) $\dfrac{z^2}{(z^2 + a^2)^2}$.  (d) $\dfrac{\sin 1/z}{z^2 + a^2}$.

(e) $\dfrac{z e^{+iz}}{z^2 + a^2}$.  (f) $\dfrac{z e^{+iz}}{z^2 - a^2}$.

(g) $\dfrac{e^{+iz}}{z^2 - a^2}$.  (h) $\dfrac{z^{-k}}{z + 1}$, $0 < k < 1$.

Hint. For the point at infinity, use the transformation $w = 1/z$ for $|z| \to 0$. For the residue, transform $f(z)\, dz$ into $g(w)\, dw$ and look at the behavior of $g(w)$.


7.2.2 The statement that the integral halfway around a singular point is equal to one-half the integral all the way around was limited to simple poles. Show, by a specific example, that

\[
\int_{\text{semicircle}} f(z)\, dz = \frac{1}{2}\oint_{\text{circle}} f(z)\, dz
\]

does not necessarily hold if the integral encircles a pole of higher order.
Hint. Try $f(z) = z^{-2}$.

7.2.3 A function $f(z)$ is analytic along the real axis except for a third-order pole at $z = x_0$. The Laurent expansion about $z = x_0$ has the form

\[
f(z) = \frac{a_{-3}}{(z - x_0)^3} + \frac{a_{-1}}{(z - x_0)} + g(z),
\]

with $g(z)$ analytic at $z = x_0$. Show that the Cauchy principal value technique is applicable in the sense that

(a) $\displaystyle\lim_{\delta\to 0}\left\{\int_{-\infty}^{x_0-\delta} f(x)\, dx + \int_{x_0+\delta}^{\infty} f(x)\, dx\right\}$ is well behaved.

(b) $\displaystyle\int_{C_{x_0}} f(z)\, dz = \pm i\pi a_{-1}$,

where $C_{x_0}$ denotes a small semicircle about $z = x_0$.


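7.2.4 The unit step function is defined as

\[
u(s - a) = \begin{cases} 0, & s < a \\ 1, & s > a. \end{cases}
\]

Show that $u(s)$ has the integral representations

(a) $\displaystyle u(s) = \lim_{\varepsilon\to 0^+} \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{ixs}}{x - i\varepsilon}\, dx$,

(b) $\displaystyle u(s) = \frac{1}{2} + \frac{1}{2\pi i} P\int_{-\infty}^{\infty} \frac{e^{ixs}}{x}\, dx$.

Note. The parameter $s$ is real.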


7.2.5 Most of the special functions of mathematical physics may be generated (defined) by a generating function of the form

\[
g(t, x) = \sum_n f_n(x) t^n.
\]

Given the following integral representations, derive the corresponding generating function:

(a) Bessel

\[
J_n(x) = \frac{1}{2\pi i}\oint e^{(x/2)(t - 1/t)}\, t^{-n-1}\, dt.
\]

(b) Legendre

\[
P_n(x) = \frac{1}{2\pi i}\oint (1 - 2tx + t^2)^{-1/2}\, t^{-n-1}\, dt.
\]

(c) Hermite

\[
H_n(x) = \frac{n!}{2\pi i}\oint e^{-t^2 + 2tx}\, t^{-n-1}\, dt.
\]

(d) Laguerre

\[
L_n(x) = \frac{1}{2\pi i}\oint \frac{e^{-xt/(1-t)}}{(1 - t)\, t^{n+1}}\, dt.
\]

Each of the contours encircles the origin and no other singular points.
7.2.6 Generalizing Example 7.2.1, show that

\[
\int_0^{2\pi} \frac{d\theta}{a \pm b\cos\theta} = \int_0^{2\pi} \frac{d\theta}{a \pm b\sin\theta} = \frac{2\pi}{(a^2 - b^2)^{1/2}}, \quad \text{for } a > |b|.
\]

What happens if $|b| > |a|$?

7.2.7 Show that

\[
\int_0^{\pi} \frac{d\theta}{(a + \cos\theta)^2} = \frac{\pi a}{(a^2 - 1)^{3/2}}, \quad a > 1.
\]

7.2.8 Show that

\[
\int_0^{2\pi} \frac{d\theta}{1 - 2t\cos\theta + t^2} = \frac{2\pi}{1 - t^2}, \quad \text{for } |t| < 1.
\]

What happens if $|t| > 1$? What happens if $|t| = 1$?

7.2.9 With the calculus of residues show that

\[
\int_0^{\pi} \cos^{2n}\theta\, d\theta = \pi\frac{(2n)!}{2^{2n}(n!)^2} = \pi\frac{(2n-1)!!}{(2n)!!}, \quad n = 0, 1, 2, \ldots.
\]

(The double factorial notation is defined in Chapter 5 and Section 10.1.)
Hint. $\cos\theta = \frac{1}{2}(e^{i\theta} + e^{-i\theta}) = \frac{1}{2}(z + z^{-1})$, $|z| = 1$.


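7.2.10 Evaluate

\[
\int_{-\infty}^{\infty} \frac{\cos bx - \cos ax}{x^2}\, dx, \quad a > b > 0.
\]

ANS. $\pi(a - b)$.


7.2.11 Prove that

\[
\int_{-\infty}^{\infty} \frac{\sin^2 x}{x^2}\, dx = \pi.
\]

Hint. $\sin^2 x = \frac{1}{2}(1 - \cos 2x)$.

7.2.12 A quantum mechanical calculation of a transition probability leads to the function $f(t, \omega) = 2(1 - \cos\omega t)/\omega^2$. Show that

\[
\int_{-\infty}^{\infty} f(t, \omega)\, d\omega = 2\pi t.
\]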
7.2.13 Show that ($a > 0$)

(a) $\displaystyle\int_{-\infty}^{\infty} \frac{\cos x}{x^2 + a^2}\, dx = \frac{\pi}{a}e^{-a}$.

How is the right side modified if $\cos x$ is replaced by $\cos kx$?

(b) $\displaystyle\int_{-\infty}^{\infty} \frac{x\sin x}{x^2 + a^2}\, dx = \pi e^{-a}$.

How is the right side modified if $\sin x$ is replaced by $\sin kx$?

These integrals may also be interpreted as Fourier cosine and sine transforms (Chapter 15).


7.2.14 Use the contour shown in Fig. 7.11 with $R \to \infty$ to prove that

\[
\int_{-\infty}^{\infty} \frac{\sin x}{x}\, dx = \pi.
\]

Figure 7.11
Contour (labeled points: $-R + iR$, $R + iR$, $-R$, $-r$, $r$, $R$)

7.2.15 In the quantum theory of atomic collisions we encounter the integral

\[
I = \int_{-\infty}^{\infty} \frac{\sin t}{t}\, e^{ipt}\, dt,
\]

in which $p$ is real. Show that

\[
I = 0, \quad |p| > 1; \qquad I = \pi, \quad |p| < 1.
\]

What happens if $p = \pm 1$?
7.2.16 Evaluate

\[
\int_0^{\infty} \frac{(\ln x)^2}{1 + x^2}\, dx
\]

(a) by appropriate series expansion of the integrand to obtain

\[
4\sum_{n=0}^{\infty} (-1)^n (2n + 1)^{-3},
\]

(b) and by contour integration to obtain

\[
\frac{\pi^3}{8}.
\]

Hint. $x \to z = e^t$. Try the contour shown in Fig. 7.12, letting $R \to \infty$.

Figure 7.12
Contour (labeled points: $-R + i\pi$, $R + i\pi$, $-R$, $R$)

7.2.17 Show that

\[
\int_0^{\infty} \frac{dx}{(x^2 + a^2)^2} = \frac{\pi}{4a^3}, \quad a > 0.
\]

7.2.18 Evaluate

\[
\int_{-\infty}^{\infty} \frac{x^2}{1 + x^4}\, dx.
\]

ANS. $\pi/\sqrt{2}$.

7.2.19 Show that

\[
\int_0^{\infty} \cos(t^2)\, dt = \int_0^{\infty} \sin(t^2)\, dt = \frac{\sqrt{\pi}}{2\sqrt{2}}.
\]

Hint. Try the contour shown in Fig. 7.13.

Note. These are the Fresnel integrals for the special case of infinity as the upper limit. For the general case of a varying upper limit, asymptotic expansions of the Fresnel integrals are the topic of Exercise 5.10.2. Spherical Bessel expansions are the subject of Exercise 12.4.13.


7.2.20 Several of the Bromwich integrals (Section 15.12) involve a portion that may be approximated by

\[
I(y) = \int_{a - iy}^{a + iy} \frac{e^{zt}}{z^{1/2}}\, dz,
\]

where $a$ and $t$ are positive and finite. Show that

\[
\lim_{y\to\infty} I(y) = 0.
\]

7.2.21 Show that

\[
\int_0^{\infty} \frac{1}{1 + x^n}\, dx = \frac{\pi/n}{\sin(\pi/n)}.
\]

Hint. Try the contour shown in Fig. 7.14.

Figure 7.13
Contour

Figure 7.14
Contour
7.2.22 (a) Show that

\[
f(z) = z^4 - 2\cos 2\theta\, z^2 + 1
\]

has zeros at $e^{i\theta}$, $e^{-i\theta}$, $-e^{i\theta}$, and $-e^{-i\theta}$.

(b) Show that

\[
\int_{-\infty}^{\infty} \frac{dx}{x^4 - 2\cos 2\theta\, x^2 + 1} = \frac{\pi}{2\sin\theta} = \frac{\pi}{2^{1/2}(1 - \cos 2\theta)^{1/2}}.
\]

Exercise 7.2.21 ($n = 4$) is a special case of this result.

7.2.23 Evaluate the integral

\[
\int_{-\infty}^{\infty} \frac{x^2\, dx}{x^4 - 2\cos 2\theta\, x^2 + 1}
\]

by contour integration.

7.2.24 The integral in Exercise 7.2.16 may be transformed into

\[
\int_0^{\infty} e^{-y}\frac{y^2}{1 + e^{-2y}}\, dy = \frac{\pi^3}{16}.
\]

Evaluate this integral by the Gauss-Laguerre quadrature and compare your numerical result with $\pi^3/16$.

ANS. Integral = 1.93775 (10 points).


7.3 Method of Steepest Descents 


Analytic Landscape 


In analyzing problems in mathematical physics, one often finds it desirable 
to know the behavior of a function for large values of the variable or some 
parameter s, that is, the asymptotic behavior of the function. Specific examples 
are furnished by the gamma function (Chapter 10) and various Bessel functions 
(Chapter 12). All these analytic functions [$I(s)$] are defined by integrals

\[
I(s) = \int_C F(z, s)\, dz, \tag{7.64}
\]

where $F$ is analytic in $z$ and depends on a real parameter $s$. We write $F(z)$ simply whenever possible.

So far, we have evaluated such definite integrals of analytic functions along the real axis (the initial path $C$) by deforming the path $C$ to $C'$ in the complex plane so that $|F|$ becomes small for all $z$ on $C'$. [See Example 7.2.3 for $I(a)$.] This method succeeds as long as only isolated singularities occur in the area between $C$ and $C'$. Their contributions are taken into account by applying the residue theorem of Section 7.2. The residues (from the simple pole part) give a measure of the singularities, where $|F| \to \infty$, which usually dominate and determine the value of the integral.
The behavior of an integral as in Eq. (7.64) clearly depends on the absolute value $|F|$ of the integrand. Moreover, the contours of $|F|$ at constant steps $\Delta|F|$ often become more closely spaced as $s$ becomes large. Let us focus on a plot of $|F(x + iy)|^2 = U^2(x, y) + V^2(x, y)$ rather than the real part $\Re F = U$ and the imaginary part $\Im F = V$ separately. Such a plot of $|F|^2$ over the complex plane is called the analytic landscape after Jensen, who, in 1912, proved that it has only saddle points and troughs, but no peaks. Moreover, the troughs reach down all the way to the complex plane, that is, go to $|F| = 0$. In the absence of singularities, saddle points are next in line to dominate the integral in Eq. (7.64). Jensen’s theorem explains why only saddle points (and singularities) of the integrand are so important for integrals. Hence the name saddle point method for finding the asymptotic behavior of $I(s)$ for $s \to \infty$ that we describe now. At a saddle point the real part $U$ of $F$ has a local maximum, for example, which implies that

\[
\frac{\partial U}{\partial x} = \frac{\partial U}{\partial y} = 0
\]

and therefore, by the use of the Cauchy-Riemann conditions of Section 6.2,

\[
\frac{\partial V}{\partial x} = \frac{\partial V}{\partial y} = 0
\]

so that $V$ has a minimum or vice versa, and $F'(z) = 0$. Jensen’s theorem prevents $U$ and $V$ from having both a maximum or minimum. See Fig. 7.15 for a typical shape (and Exercises 6.2.3 and 6.2.4). We will choose the path $C$ so that it runs over the saddle point and in the valleys elsewhere so that the saddle point dominates the value of $I(s)$. This deformation of the path is analogous to the applications of the residue theorem to integrals over functions with poles. In the rare case that there are several saddle points, we treat each alike, and their contributions will add to $I(s)$ for large $s$.

Figure 7.15
A Saddle Point

To prove that there are no peaks, assume there is one at $z_0$. That is, $|F(z_0)|^2 > |F(z)|^2$ for all $z$ of a neighborhood $|z - z_0| \le r$. If

\[
F(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n
\]

is the Taylor expansion at $z_0$, the mean value $m(F)$ on the circle $z = z_0 + r\exp(i\varphi)$ becomes

\[
m(F) = \frac{1}{2\pi}\int_0^{2\pi} |F(z_0 + re^{i\varphi})|^2\, d\varphi
= \frac{1}{2\pi}\int_0^{2\pi} \sum_{m,n=0}^{\infty} a_m^* a_n r^{m+n} e^{i(n-m)\varphi}\, d\varphi
= \sum_{n=0}^{\infty} |a_n|^2 r^{2n} \ge |a_0|^2 = |F(z_0)|^2, \tag{7.65}
\]

using orthogonality, $\frac{1}{2\pi}\int_0^{2\pi} \exp[i(n - m)\varphi]\, d\varphi = \delta_{mn}$. Since $m(F)$ is the mean value of $|F|^2$ on the circle of radius $r$, there must be a point $z_1$ on it so that $|F(z_1)|^2 \ge m(F) \ge |F(z_0)|^2$, which contradicts our assumption. Hence, there can be no such peak.

Next, let us assume there is a minimum at $z_0$ so that $0 < |F(z_0)|^2 < |F(z)|^2$ for all $z$ of a neighborhood of $z_0$. In other words, the dip in the valley does not go down to the complex plane. Then $|F(z)|^2 > 0$ and since $1/F(z)$ is analytic there, it has a Taylor expansion and $z_0$ would be a peak of $1/|F(z)|^2$, which is impossible. This proves Jensen’s theorem. We now turn our attention to the integral in Eq. (7.64).
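
The mean-value identity in Eq. (7.65) can be verified directly for a polynomial. In the following sketch the function name, the polynomial $F(z) = 1 + 2z + 3z^2$, and the sample center and radius are illustrative assumptions; the circle average of $|F|^2$ is compared with $\sum_n |a_n|^2 r^{2n}$ built from the Taylor coefficients at $z_0$.

```python
import cmath
import math

def mean_square_on_circle(F, z0, r, n=64):
    """Average of |F|^2 over n equally spaced points on the circle |z - z0| = r."""
    total = 0.0
    for k in range(n):
        z = z0 + r * cmath.exp(2j * math.pi * k / n)
        total += abs(F(z)) ** 2
    return total / n

F = lambda z: 1.0 + 2.0 * z + 3.0 * z * z   # illustrative polynomial
z0, r = 0.2 - 0.1j, 0.5                      # illustrative center and radius

# Taylor coefficients of F about z0: a0 = F(z0), a1 = F'(z0), a2 = F''(z0)/2
a0, a1, a2 = F(z0), 2.0 + 6.0 * z0, 3.0
series = abs(a0) ** 2 + abs(a1) ** 2 * r ** 2 + abs(a2) ** 2 * r ** 4

print(mean_square_on_circle(F, z0, r), series, abs(F(z0)) ** 2)
```

The first two numbers coincide, and both exceed $|F(z_0)|^2$, which is the inequality that rules out a peak at $z_0$.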


Saddle Point Method

Since a saddle point $z_0$ of $|F(z)|^2$ lies above the complex plane, that is, $|F(z_0)|^2 > 0$, so $F(z_0) \ne 0$, we write $F$ in exponential form, $F(z) = e^{f(z)}$, in its vicinity without loss of generality. Note that having no zero in the complex plane is a characteristic property of the exponential function. Moreover, any saddle point with $F(z) = 0$ becomes a trough of $|F(z)|^2$ because $|F(z)|^2 \ge 0$. A case in point is the function $z^2$ at $z = 0$, where the derivative $2z$ vanishes. Here $z^2 = (x + iy)^2 = x^2 - y^2 + 2ixy$, and $2xy$ has a saddle point at $z = 0$, as does $x^2 - y^2$, but $|z|^4$ has a trough there. The phase of $F(z_0)$ is given by $\Im f(z_0)$. At $z_0$ the tangential plane is horizontal; that is,

\[
\left.\frac{dF}{dz}\right|_{z=z_0} = 0, \quad\text{or equivalently}\quad \left.\frac{df}{dz}\right|_{z=z_0} = 0.
\]

This condition locates the saddle point.
Our next goal is to determine the direction of steepest descent, the heart of the saddle point method. To this end, we use the power-series expansion of $f$ at $z_0$,

\[
f(z) = f(z_0) + \frac{1}{2}f''(z_0)(z - z_0)^2 + \cdots, \tag{7.66}
\]

or

\[
f(z) = f(z_0) + \frac{1}{2}\left(f''(z_0) + \varepsilon\right)(z - z_0)^2, \tag{7.67}
\]

upon collecting all higher powers in the (small) $\varepsilon$. Let us take $f''(z_0) \ne 0$ for simplicity. Then

\[
f''(z_0)(z - z_0)^2 = -t^2, \quad t \text{ real} \tag{7.68}
\]

defines a line through $z_0$ (saddle point axis in Fig. 7.15). At $z_0$, $t = 0$. Along the axis $\Im f''(z_0)(z - z_0)^2$ is zero and $v = \Im f(z) \approx \Im f(z_0)$ is constant if $\varepsilon$ in Eq. (7.67) is neglected. Thus, $F$ has constant phase along the axis. Equation (7.68) can also be expressed in terms of angles,

\[
\arg(z - z_0) = \frac{\pi}{2} - \frac{1}{2}\arg f''(z_0) = \text{constant}. \tag{7.69}
\]

Since $|F(z)|^2 = \exp(2\Re f)$ varies monotonically with $\Re f$, $|F(z)|^2 \sim \exp(-t^2)$ falls off exponentially from its maximum at $t = 0$ along this axis. Hence the name steepest descent for this method of extracting the asymptotic behavior of $I(s)$ for $s \to \infty$. The line through $z_0$ defined by

\[
f''(z_0)(z - z_0)^2 = +t^2 \tag{7.70}
\]

is orthogonal to this axis (dashed line in Fig. 7.15), which is evident from its angle

\[
\arg(z - z_0) = -\frac{1}{2}\arg f''(z_0) = \text{constant}, \tag{7.71}
\]

when compared with Eq. (7.69). Here, $|F(z)|^2$ grows exponentially.

The curves $\Re f(z) = \Re f(z_0)$ go through $z_0$ so that $\Re[(f''(z_0) + \varepsilon)(z - z_0)^2] = 0$, or $(f''(z_0) + \varepsilon)(z - z_0)^2 = it$ for real $t$. Expressing this in angles as

\[
\arg(z - z_0) = \frac{\pi}{4} - \frac{1}{2}\arg(f''(z_0) + \varepsilon), \quad t > 0 \tag{7.72a}
\]

\[
\arg(z - z_0) = -\frac{\pi}{4} - \frac{1}{2}\arg(f''(z_0) + \varepsilon), \quad t < 0, \tag{7.72b}
\]

and comparing with Eqs. (7.69) and (7.71), we note that these curves (dot-dashed line in Fig. 7.15) divide the saddle point region into four sectors: two with $\Re f(z) > \Re f(z_0)$ (hence $|F(z)| > |F(z_0)|$) shown shaded in Fig. 7.15 and two with $\Re f(z) < \Re f(z_0)$ (hence $|F(z)| < |F(z_0)|$). They are at $\pm\frac{\pi}{4}$ angles from the axis. Thus, the integration path has to avoid the shaded areas where $|F|$ rises. If a path is chosen to run up the slopes above the saddle point, large imaginary parts of $f(z)$ are involved that lead to rapid oscillations of $F(z) = e^{f(z)}$ and cancelling contributions to the integral. So far, our treatment has been general, except for $f''(z_0) \ne 0$, which can be relaxed.

Now we are ready to specialize the integrand F further in order to 
tie up the path selection with the asymptotic behavior as s -* oo. We as¬ 
sume that the parameter s appears linearly in the exponent; that is, we re¬ 
place exp f(z, s) -> exp(s/(s)). This dependence on s often occurs in physics 
applications and ensures that the saddle point at zo grows with s -» oo [if 
3t/(s 0 ) > 0]. In order to account for the region far away from the saddle point 
that is not influenced by s, we include another analytic function g(z) that varies 
slowly near the saddle point and is independent of s. Altogether, our integral 
has the more appropriate and specific form 


7(s) = f g{z)e si ^dz. 


(7.73) 


Jc 


Our goal now is to estimate the integral 7(s) near the saddle point. The 
path of steepest descent is the saddle point axis when we neglect the 
higher order terms, e, in Eq. (7.67). With s, the path of steepest descent is the 
curve close to the axis within the unshaded sectors, where the phase v = 'Jf(z) 
is strictly constant, whereas 3/(s) is only approximately constant on the axis. 
We approximate 7 (s) by the integral along the piece of the axis inside the patch 
in Fig. 7.15, where [compare with Eq. (7.68)] 



(7.74) 


Here, the interval [a, b] will be given below. We find 





(7.75a) 


and the omitted part is small and can be estimated because $\Re(f(z) - f(z_0))$
has an upper negative bound (e.g., $-R$) that depends on the size of the saddle
point patch in Fig. 7.15 [i.e., the values of $a$, $b$ in Eq. (7.74)] that we choose. In
Eq. (7.75a) we use the Taylor expansions

$$f(z_0 + x e^{i\alpha}) = f(z_0) + \tfrac{1}{2} f''(z_0)\, e^{2i\alpha} x^2 + \cdots,$$

$$g(z_0 + x e^{i\alpha}) = g(z_0) + g'(z_0)\, e^{i\alpha} x + \cdots, \tag{7.75b}$$

and recall from Eq. (7.74) that

$$\tfrac{1}{2} f''(z_0)\, e^{2i\alpha} = -\tfrac{1}{2} |f''(z_0)| < 0.$$

We find for the leading term

$$I(s) = g(z_0)\, e^{s f(z_0) + i\alpha} \int_a^b e^{-\frac{1}{2} s |f''(z_0)| x^2}\, dx. \tag{7.76}$$


Since the integrand in Eq. (7.76) is essentially zero when $x$ departs appreciably
from the origin, we let $b \to \infty$ and $a \to -\infty$. The small error involved is
straightforward to estimate. Noting that the remaining integral is just a Gauss
error integral,

$$\int_{-\infty}^{\infty} e^{-\frac{1}{2} c^2 x^2}\, dx = \frac{1}{c} \int_{-\infty}^{\infty} e^{-\frac{1}{2} x^2}\, dx = \frac{\sqrt{2\pi}}{c},$$

we obtain the asymptotic formula

$$I(s) = \frac{\sqrt{2\pi}\, g(z_0)\, e^{s f(z_0)}\, e^{i\alpha}}{|s f''(z_0)|^{1/2}} \tag{7.77}$$

for the case in which one saddle point dominates. Here, the phase $\alpha$ was
introduced in Eqs. (7.74) and (7.69).

One note of warning: We assumed that the only significant contribution to
the integral came from the immediate vicinity of the saddle point $z = z_0$. This
condition must be checked for each new problem (Exercise 7.3.3).


EXAMPLE 7.3.1


Asymptotic Form of the Hankel Function, $H_\nu^{(1)}(s)$ In Section 12.3, it
is shown that the Hankel functions, which satisfy Bessel’s equation, may be
defined by

$$H_\nu^{(1)}(s) = \frac{1}{\pi i} \int_{0,\,C_1}^{\infty e^{i\pi}} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu+1}}, \tag{7.78}$$

$$H_\nu^{(2)}(s) = \frac{1}{\pi i} \int_{\infty e^{-i\pi},\,C_2}^{0} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu+1}}, \tag{7.79}$$

where $\infty e^{i\pi} = -\infty$ and the contour $C_1$ is the curve in the upper half-plane of
Fig. 7.16 that starts at the origin and ends at $-\infty$. The contour $C_2$ is in the lower
half-plane, starting at $\infty e^{-i\pi}$ and ending at the origin. We apply the method of
steepest descents to the first Hankel function, $H_\nu^{(1)}(s)$, which is conveniently
in the form specified by Eq. (7.73), with $g(z) = 1/(i\pi z^{\nu+1})$ and $f(z)$ given by

$$f(z) = \frac{1}{2}\left(z - \frac{1}{z}\right). \tag{7.80}$$


Figure 7.16 Hankel Function Contours


By differentiating, we obtain

$$f'(z) = \frac{1}{2} + \frac{1}{2z^2}, \qquad f''(z) = -\frac{1}{z^3}. \tag{7.81}$$

Setting $f'(z) = 0$, we obtain

$$z = i, -i. \tag{7.82}$$


Hence, there are saddle points at $z = +i$ and $z = -i$. At $z = i$, $f''(i) = -i$,
or $\arg f''(i) = -\pi/2$, so that the saddle point direction is given by Eq. (7.74)
as $\alpha = \frac{\pi}{2} + \frac{\pi}{4} = \frac{3}{4}\pi$. For the integral for $H_\nu^{(1)}(s)$ we must choose the contour
through the point $z = +i$ so that it starts at the origin, moves out tangentially
to the positive real axis, and then moves around through the saddle point at
$z = +i$ in the direction given by the angle $\alpha = 3\pi/4$ and then on out to minus
infinity, asymptotic with the negative real axis. The path of steepest ascent that
we must avoid has the phase $-\frac{1}{2}\arg f''(i) = \frac{\pi}{4}$ according to Eq. (7.71) and is
orthogonal to the axis, our path of steepest descent.

Direct substitution into Eq. (7.77) with $\alpha = 3\pi/4$ yields

$$H_\nu^{(1)}(s) = \frac{1}{\pi i}\, \frac{\sqrt{2\pi}\; i^{-\nu-1}\, e^{(s/2)(i - 1/i)}\, e^{3\pi i/4}}{|(s/2)(-2/i^3)|^{1/2}} = \sqrt{\frac{2}{\pi s}}\; e^{(i\pi/2)(-\nu-2)}\, e^{is}\, e^{i(3\pi/4)}. \tag{7.83}$$

By combining terms, we finally obtain

$$H_\nu^{(1)}(s) \approx \sqrt{\frac{2}{\pi s}}\; e^{i(s - \nu(\pi/2) - \pi/4)} \tag{7.84}$$

as the leading term of the asymptotic expansion of the Hankel function $H_\nu^{(1)}(s)$.
Additional terms, if desired, may be picked up from the power series of $f$ and
$g$ in Eq. (7.75b). The other Hankel function can be treated similarly using the
saddle point at $z = -i$. ■
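
A quick numerical sanity check of Eq. (7.84) is possible for $\nu = 0$. The sketch below relies on two standard facts that are not derived in this section and are therefore assumptions here: for real $s$, $\Re H_0^{(1)}(s) = J_0(s)$, and $J_0(s) = \frac{1}{\pi}\int_0^\pi \cos(s \sin\theta)\, d\theta$ (Bessel's integral). The function names and the sample value $s = 50$ are our own illustrative choices.

```python
import math


def bessel_j0(s, n=2000):
    """J_0(s) from Bessel's integral (1/pi) * int_0^pi cos(s sin t) dt.

    Uses the midpoint rule with n panels (assumed sufficient for moderate s).
    """
    h = math.pi / n
    total = sum(math.cos(s * math.sin((k + 0.5) * h)) for k in range(n))
    return total * h / math.pi


def hankel1_leading(s, nu=0.0):
    """Leading saddle-point term, Eq. (7.84)."""
    amplitude = math.sqrt(2.0 / (math.pi * s))
    phase = s - nu * math.pi / 2.0 - math.pi / 4.0
    return amplitude * complex(math.cos(phase), math.sin(phase))


if __name__ == "__main__":
    s = 50.0
    print("J_0(s) by quadrature  :", bessel_j0(s))
    print("Re of leading term    :", hankel1_leading(s).real)
```

The two numbers should agree up to a relative correction of order $1/s$, which is the size of the first neglected term in the asymptotic series.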


EXAMPLE 7.3.2


Asymptotic Form of the Factorial Function In many physical problems,
particularly in the field of statistical mechanics, it is desirable to have an
accurate approximation of the gamma or factorial function of very large numbers.
As developed in Section 10.1, the factorial function may be defined by the Euler
integral

$$s! = \int_0^\infty \rho^s e^{-\rho}\, d\rho = s^{s+1} \int_0^\infty e^{s(\ln z - z)}\, dz. \tag{7.85}$$

Here, we have made the substitution $\rho = zs$ in order to put the integral into the
form required by Eq. (7.73). As before, we assume that $s$ is real and positive,
from which it follows that the integrand vanishes at the limits 0 and $\infty$. By
differentiating the $z$ dependence appearing in the exponent, we obtain

$$f'(z) = \frac{d}{dz}(\ln z - z) = \frac{1}{z} - 1, \qquad f''(z) = -\frac{1}{z^2}, \tag{7.86}$$

which shows that the point $z = 1$ is a saddle point, and $\arg f''(1) = \arg(-1) = \pi$.
According to Eq. (7.74) we let

$$z - 1 = x e^{i\alpha}, \qquad \alpha = \frac{\pi}{2} - \frac{1}{2}\arg f''(1) = \frac{\pi}{2} - \frac{\pi}{2}, \tag{7.87}$$

with $x$ small to describe the contour in the vicinity of the saddle point. From
this we see that the direction of steepest descent is along the real axis, a
conclusion that we could have reached more or less intuitively.

Direct substitution into Eq. (7.77) with $\alpha = 0$ gives

$$s! \approx \frac{\sqrt{2\pi}\, s^{s+1} e^{-s}}{|s(-1^{-2})|^{1/2}}. \tag{7.88}$$

Thus, the first term in the asymptotic expansion of the factorial function is

$$s! \approx \sqrt{2\pi s}\, s^s e^{-s}. \tag{7.89}$$

This result is the first term in Stirling’s expansion of the factorial function.
The method of steepest descent is probably the easiest way of obtaining this
first term. If more terms in the expansion are desired, then the method of
Section 10.3 is preferable. ■
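
To get a feeling for how good the leading term of Eq. (7.89) is, the ratio of $s!$ to the approximation can be tabulated. The following is a minimal sketch using only the Python standard library; the helper names are our own, and `math.lgamma(s + 1)` is used for $\ln(s!)$ so that large $s$ does not overflow.

```python
import math


def stirling_leading(s):
    """Leading term of Eq. (7.89): sqrt(2 pi s) * s**s * exp(-s)."""
    return math.sqrt(2.0 * math.pi * s) * s**s * math.exp(-s)


def ratio_to_factorial(s):
    """s! / [sqrt(2 pi s) s^s e^(-s)], evaluated through logarithms."""
    log_exact = math.lgamma(s + 1.0)  # ln(s!)
    log_approx = 0.5 * math.log(2.0 * math.pi * s) + s * math.log(s) - s
    return math.exp(log_exact - log_approx)


if __name__ == "__main__":
    for s in (1.0, 5.0, 10.0, 50.0, 100.0):
        print(s, ratio_to_factorial(s))
```

The ratio stays slightly above 1 and approaches 1 as $s$ grows, as expected for the first term of an asymptotic expansion.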


In the foregoing example the calculation was carried out by assuming s to 
be real. This assumption is not necessary. We may show that Eq. (7.89) also 
holds when s is replaced by the complex variable w, provided only that the 
real part of w is required to be large and positive. 


SUMMARY

Asymptotic limits of integral representations of functions are extremely
important in many approximations and applications in physics:

$$\int_C g(z)\, e^{s f(z)}\, dz \sim \frac{\sqrt{2\pi}\, g(z_0)\, e^{s f(z_0) + i\alpha}}{\sqrt{|s f''(z_0)|}}.$$

The saddle point method is one method of choice for deriving them and belongs
in the tool kit of every physicist and engineer.


EXERCISES 


7.3.1 Using the method of steepest descents, evaluate the second Hankel
function given by

$$H_\nu^{(2)}(s) = \frac{1}{\pi i} \int_{-\infty,\,C_2}^{0} e^{(s/2)(z - 1/z)}\, \frac{dz}{z^{\nu+1}},$$

with contour $C_2$ as shown in Fig. 7.16.

ANS. $H_\nu^{(2)}(s) \approx \sqrt{\dfrac{2}{\pi s}}\; e^{-i(s - \pi/4 - \nu\pi/2)}$.

7.3.2 Find the steepest path and leading asymptotic expansion for the Fresnel
integrals $\int_0^s \cos x^2\, dx$, $\int_0^s \sin x^2\, dx$.

Hint. Use $\int_0^1 e^{isz^2}\, dz$.

7.3.3 (a) In applying the method of steepest descent to the Hankel function
$H_\nu^{(1)}(s)$, show that

$$\Re[f(z)] < \Re[f(z_0)] = 0$$

for $z$ on the contour $C_1$ but away from the point $z = z_0 = i$.
(b) Show that

$$\Re[f(z)] > 0 \quad \text{for } 0 < r < 1, \quad \frac{\pi}{2} < \theta \le \pi,$$

and

$$\Re[f(z)] < 0 \quad \text{for } r > 1, \quad -\frac{\pi}{2} < \theta < \frac{\pi}{2}$$

(Fig. 7.17). This is why $C_1$ may not be deformed to pass through the
second saddle point $z = -i$. Compare with and verify the dot-dashed
lines in Fig. 7.15 for this case.


Figure 7.17 

Saddle Points of 
Hankel Function 



7.3.4 Show that Stirling’s formula

$$s! \approx \sqrt{2\pi s}\, s^s e^{-s}$$

holds for complex values of $s$ [with $\Re(s)$ large and positive].
Hint. This involves assigning a phase to $s$ and then demanding that
$\Im[s f(z)] = \text{constant}$ in the vicinity of the saddle point.

7.3.5 Assume $H_\nu^{(1)}(s)$ to have a negative power-series expansion of the form

$$H_\nu^{(1)}(s) = \sqrt{\frac{2}{\pi s}}\; e^{i(s - \nu(\pi/2) - \pi/4)} \sum_{n=0}^{\infty} a_{-n} s^{-n},$$

with the coefficient of the summation obtained by the method of steepest
descent. Substitute into Bessel’s equation and show that you reproduce
the asymptotic series for $H_\nu^{(1)}(s)$ given in Section 12.3.



Additional Reading 


Wyld, H. W. (1976). Mathematical Methods for Physics. Benjamin/Cummings,
Reading, MA. Reprinted, Perseus, Cambridge, MA (1999). This is a
relatively advanced text.




Differential Equations 


8.1 Introduction 


In physics, the knowledge of the force in an equation of motion usually leads
to a differential equation, with time as the independent variable, that
governs dynamical changes in space. Almost all the elementary and numerous
advanced parts of theoretical physics are formulated in terms of differential
equations. Sometimes these are ordinary differential equations in one variable
(ODE). More often, the equations are partial differential equations (PDE) in
combinations of space and time variables. In fact, PDEs motivate physicists’
interest in ODEs. The term ordinary is applied when the only derivatives
$dy/dx$, $d^2y/dx^2$, ... are ordinary or total derivatives. An ODE is first order if
it contains the first and no higher derivatives of the unknown function $y(x)$,
second order if it contains $d^2y/dx^2$ and no higher derivatives, etc.

Recall from calculus that the operation of taking an ordinary derivative is
a linear operation $(\mathcal{L})$,¹

$$\frac{d(a\varphi(x) + b\psi(x))}{dx} = a\frac{d\varphi}{dx} + b\frac{d\psi}{dx}.$$

In general,

$$\mathcal{L}(a\varphi + b\psi) = a\mathcal{L}(\varphi) + b\mathcal{L}(\psi), \tag{8.1}$$


where $a$ and $b$ are constants. An ODE is called linear if it is linear in the
unknown function and its derivatives. Thus, linear ODEs appear as linear
operator equations

$$\mathcal{L}\psi = F,$$

where $\psi$ is the unknown function or general solution, the source $F$ is a known
function of one variable (for ODEs) and independent of $\psi$, and $\mathcal{L}$ is a linear
combination of derivatives acting on $\psi$. If $F \neq 0$, the ODE is called
inhomogeneous; if $F = 0$, the ODE is called homogeneous. The solution of the
homogeneous ODE can be multiplied by an arbitrary constant. If $\psi_p$ is a
particular solution of the inhomogeneous ODE, then $\psi_h = \psi - \psi_p$ is a solution
of the homogeneous ODE because $\mathcal{L}(\psi - \psi_p) = F - F = 0$. Thus, the
general solution is given by $\psi = \psi_p + \psi_h$. For the homogeneous ODE, any linear
combination of solutions is again a solution, provided the differential equation
is linear in the unknown function $\psi_h$; this is the superposition principle.
We usually have to solve the homogeneous ODE first before searching for
particular solutions of the inhomogeneous ODE.

¹ We are especially interested in linear operators because in quantum mechanics physical quantities
are represented by linear operators operating in a complex, infinite dimensional Hilbert space.

Since the dynamics of many physical systems involve second-order derivatives
(e.g., acceleration in classical mechanics and the kinetic energy operator,
$\sim\nabla^2$, in quantum mechanics), differential equations of second order occur
most frequently in physics. [Maxwell’s and Dirac’s equations are first order but
involve two unknown functions. Eliminating one unknown yields a second-order
differential equation for the other (compare Section 1.9).] Similarly, any
higher order (linear) ODE can be reduced to a system of coupled first-order
ODEs.

Nonetheless, there are many physics problems that involve first-order
ODEs. Examples are resistance-inductance electrical circuits, radioactive
decays, and special second-order ODEs that can be reduced to first-order
ODEs. These cases and separable ODEs will be discussed first. ODEs of
second order are more common and treated in subsequent sections, involving
the special class of linear ODEs with constant coefficients. The important
power-series expansion method of solving ODEs is demonstrated using
second-order ODEs.


8.2 First-Order ODEs 


Certain physical problems involve first-order differential equations. Moreover, 
sometimes second-order ODEs can be reduced to first-order ODEs, which then 
have to be solved. Thus, it seems desirable to start with them. We consider here 
differential equations of the general form 


$$\frac{dy}{dx} = f(x, y) = -\frac{P(x, y)}{Q(x, y)}. \tag{8.2}$$


Equation (8.2) is clearly a first-order ODE; it may or may not be linear, although 
we shall treat the linear case explicitly later, starting with Eq. (8.12). 


Separable Variables 


Frequently, Eq. (8.2) will have the special form

$$\frac{dy}{dx} = f(x, y) = -\frac{P(x)}{Q(y)}. \tag{8.3}$$

Then it may be rewritten as

$$P(x)\,dx + Q(y)\,dy = 0.$$

Integrating from $(x_0, y_0)$ to $(x, y)$ yields

$$\int_{x_0}^{x} P(X)\,dX + \int_{y_0}^{y} Q(Y)\,dY = 0. \tag{8.4}$$

Here we have used capitals to distinguish the integration variables from the
upper limits of the integrals, a practice that we will continue without further
comment. Since the lower limits $x_0$ and $y_0$ contribute constants, we may ignore
the lower limits of integration and write a constant of integration on the
right-hand side instead of zero, which can be used to satisfy an initial condition.
Note that this separation of variables technique does not require that the
differential equation be linear.


EXAMPLE 8.2.1


Radioactive Decay The decay of a radioactive sample involves an event
that is repeated at a constant rate $\lambda$. If the observation time $dt$ is small enough
so that the emission of two or more particles is negligible, then the probability
that one particle is emitted is $\lambda\,dt$, with $\lambda\,dt \ll 1$. The decay law is given by

$$\frac{dN(t)}{dt} = -\lambda N(t), \tag{8.5}$$

where $N(t)$ is the number of radioactive atoms in the sample at time $t$. This
ODE is separable,

$$\frac{dN}{N} = -\lambda\,dt, \tag{8.6}$$

and can be integrated to give

$$\ln N = -\lambda t + \ln N_0, \quad \text{or} \quad N(t) = N_0 e^{-\lambda t}, \tag{8.7}$$

where we have written the integration constant in logarithmic form for
convenience; $N_0$ is fixed by an initial condition $N(0) = N_0$. ■


In the next example from classical mechanics, the ODE is separable but not
linear in the unknown, which poses no problem.


EXAMPLE 8.2.2


Parachutist We want to find the velocity of the falling parachutist as a
function of time and are particularly interested in the constant limiting
velocity, $v_0$, that comes about by air resistance taken to be quadratic, $-bv^2$, and
opposing the force of the gravitational attraction, $mg$, of the earth. We choose
a coordinate system in which the positive direction is downward so that the
gravitational force is positive. For simplicity we assume that the parachute
opens immediately, that is, at time $t = 0$, where $v(t = 0) = 0$, our initial
condition. Newton’s law applied to the falling parachutist gives

$$m\dot{v} = mg - bv^2,$$

where $m$ includes the mass of the parachute.

The terminal velocity $v_0$ can be found from the equation of motion as
$t \to \infty$, when there is no acceleration, $\dot{v} = 0$, so that

$$b v_0^2 = mg, \quad \text{or} \quad v_0 = \sqrt{mg/b}.$$

The variables $t$ and $v$ separate,

$$\frac{dv}{g - \frac{b}{m} v^2} = dt,$$

which we integrate by decomposing the denominator into partial fractions.
The roots of the denominator are at $v = \pm v_0$. Hence,

$$\left(g - \frac{b}{m} v^2\right)^{-1} = \frac{m}{2 v_0 b}\left(\frac{1}{v + v_0} - \frac{1}{v - v_0}\right).$$

Integrating both terms yields

$$\int_0^v \frac{dV}{g - \frac{b}{m} V^2} = \frac{1}{2}\sqrt{\frac{m}{gb}}\, \ln\frac{v_0 + v}{v_0 - v} = t.$$

Solving for the velocity yields

$$v = \frac{e^{2t/T} - 1}{e^{2t/T} + 1}\, v_0 = v_0\, \frac{\sinh(t/T)}{\cosh(t/T)} = v_0 \tanh\frac{t}{T},$$

where $T = \sqrt{m/(gb)}$ is the time constant governing the asymptotic approach of
the velocity to the limiting velocity $v_0$.

Putting in numerical values, $g = 9.8$ m/sec$^2$ and taking $b = 700$ kg/m,
$m = 70$ kg, gives $v_0 = \sqrt{9.8/10} \sim 1$ m/sec, $\sim 3.6$ km/hr, or $\sim 2.23$ miles/hr, the
walking speed of a pedestrian at landing, and $T = \sqrt{m/(bg)} = 1/\sqrt{10 \cdot 9.8} \sim 0.1$
sec. Thus, the constant speed $v_0$ is reached within 1 sec. Finally, because it
is always important to check the solution, we verify that our solution
satisfies

$$\dot{v} = \frac{\cosh(t/T)}{\cosh(t/T)}\, \frac{v_0}{T} - \frac{\sinh^2(t/T)}{\cosh^2(t/T)}\, \frac{v_0}{T} = \frac{v_0}{T} - \frac{v^2}{T v_0} = g - \frac{b}{m} v^2,$$

that is, Newton’s equation of motion. The more realistic case, in which the
parachutist is in free fall with an initial speed $v_i = v(0) > 0$ before the
parachute opens, is addressed in Exercise 8.2.16. ■
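
The closed form $v = v_0 \tanh(t/T)$ can also be checked against a direct numerical integration of Newton's equation. The sketch below is illustrative only: it uses the numerical values quoted in the example ($m = 70$ kg, $g = 9.8$ m/sec$^2$, $b = 700$ kg/m), while the function names and the choice of the classical fourth-order Runge-Kutta scheme are our own.

```python
import math


def terminal_velocity(m, g, b):
    """v0 = sqrt(m g / b), from b v0^2 = m g."""
    return math.sqrt(m * g / b)


def time_constant(m, g, b):
    """T = sqrt(m / (g b))."""
    return math.sqrt(m / (g * b))


def velocity_exact(t, m, g, b):
    """Closed-form solution v(t) = v0 tanh(t/T) for v(0) = 0."""
    return terminal_velocity(m, g, b) * math.tanh(t / time_constant(m, g, b))


def velocity_rk4(t_end, m, g, b, steps=1000):
    """Integrate dv/dt = g - (b/m) v^2 from v(0) = 0 with classical RK4."""
    def rhs(v):
        return g - (b / m) * v * v

    h = t_end / steps
    v = 0.0
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs(v + 0.5 * h * k1)
        k3 = rhs(v + 0.5 * h * k2)
        k4 = rhs(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


if __name__ == "__main__":
    m, g, b = 70.0, 9.8, 700.0  # values quoted in the example
    print("v0 =", terminal_velocity(m, g, b), "m/sec")
    print("T  =", time_constant(m, g, b), "sec")
    for t in (0.05, 0.1, 0.3, 1.0):
        print(t, velocity_exact(t, m, g, b), velocity_rk4(t, m, g, b))
```

Both columns should agree to many digits, and by $t = 1$ sec the velocity is indistinguishable from $v_0$.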


Exact Differential Equations 

We rewrite Eq. (8.2) as

$$P(x, y)\,dx + Q(x, y)\,dy = 0. \tag{8.8}$$

This equation is said to be exact if we can match the left-hand side of it to a
differential $d\varphi$,

$$d\varphi = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy. \tag{8.9}$$

Since Eq. (8.8) has a zero on the right, we look for an unknown function
$\varphi(x, y) = \text{constant}$ and $d\varphi = 0$.

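We have [if such a function $\varphi(x, y)$ exists]

$$P(x, y)\,dx + Q(x, y)\,dy = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy \tag{8.10a}$$

and

$$\frac{\partial\varphi}{\partial x} = P(x, y), \qquad \frac{\partial\varphi}{\partial y} = Q(x, y). \tag{8.10b}$$

The necessary and sufficient condition for our equation to be exact is that the
second, mixed partial derivatives of $\varphi(x, y)$ (assumed continuous) are
independent of the order of differentiation:

$$\frac{\partial^2\varphi}{\partial y\,\partial x} = \frac{\partial P(x, y)}{\partial y} = \frac{\partial Q(x, y)}{\partial x} = \frac{\partial^2\varphi}{\partial x\,\partial y}. \tag{8.11}$$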


Note the resemblance to Eq. (1.124) of Section 1.12. If Eq. (8.8) corresponds
to a curl (equal to zero), then a potential, $\varphi(x, y)$, must exist.

If $\varphi(x, y)$ exists then, from Eqs. (8.8) and (8.10a), our solution is

$$\varphi(x, y) = C.$$

We may construct $\varphi(x, y)$ from its partial derivatives, just as we construct a
magnetic vector potential from its curl. See Exercises 8.2.7 and 8.2.8.

It may well turn out that Eq. (8.8) is not exact and that Eq. (8.11) is not
satisfied. However, there always exists at least one and perhaps many more
integrating factors, $\alpha(x, y)$, such that

$$\alpha(x, y)P(x, y)\,dx + \alpha(x, y)Q(x, y)\,dy = 0$$

is exact. Unfortunately, an integrating factor is not always obvious or easy
to find. Unlike the case of the linear first-order differential equation to be
considered next, there is no systematic way to develop an integrating factor
for Eq. (8.8).

A differential equation in which the variables have been separated is
automatically exact. An exact differential equation is not necessarily separable.
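
The exactness condition of Eq. (8.11) is easy to probe numerically. The sketch below estimates $\partial P/\partial y$ and $\partial Q/\partial x$ by central differences at a few sample points. The three equations tested are hypothetical illustrations chosen by us: an exact one derived from $\varphi = x^2 y + y^3$, the non-exact equation $y\,dx - x\,dy = 0$, and the same equation multiplied by the integrating factor $\alpha = 1/y^2$, which turns it into $d(x/y) = 0$. A numerical test at sample points is only a plausibility check, not a proof.

```python
def partial(func, x, y, wrt, h=1e-5):
    """Central-difference estimate of a first partial derivative of func(x, y)."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    return (func(x, y + h) - func(x, y - h)) / (2.0 * h)


def is_exact(P, Q, points, tol=1e-6):
    """Check dP/dy == dQ/dx, Eq. (8.11), at the given sample points."""
    return all(
        abs(partial(P, x, y, "y") - partial(Q, x, y, "x")) < tol
        for x, y in points
    )


# Hypothetical exact example: phi = x^2 y + y^3.
def P_exact(x, y):
    return 2.0 * x * y


def Q_exact(x, y):
    return x * x + 3.0 * y * y


# Hypothetical non-exact example: y dx - x dy = 0.
def P_raw(x, y):
    return y


def Q_raw(x, y):
    return -x


# The same equation multiplied by the integrating factor alpha = 1/y^2.
def P_scaled(x, y):
    return P_raw(x, y) / (y * y)


def Q_scaled(x, y):
    return Q_raw(x, y) / (y * y)


SAMPLE_POINTS = [(0.5, 1.0), (1.0, 2.0), (-1.5, 0.7)]

if __name__ == "__main__":
    print(is_exact(P_exact, Q_exact, SAMPLE_POINTS))
    print(is_exact(P_raw, Q_raw, SAMPLE_POINTS))
    print(is_exact(P_scaled, Q_scaled, SAMPLE_POINTS))
```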



Linear First-Order ODEs 

If $f(x, y)$ in Eq. (8.2) has the form $-p(x)y + q(x)$, then Eq. (8.2) becomes

$$\frac{dy}{dx} + p(x)y = q(x). \tag{8.12}$$

Equation (8.12) is the most general linear first-order ODE. If $q(x) = 0$, Eq.
(8.12) is homogeneous (in $y$). A nonzero $q(x)$ may be regarded as a source
or a driving term for the inhomogeneous ODE. In Eq. (8.12), each term is
linear in $y$ or $dy/dx$. There are no higher powers, such as $y^2$, and no products,
such as $y(dy/dx)$. Note that the linearity refers to the $y$ and $dy/dx$; $p(x)$ and
$q(x)$ need not be linear in $x$. Equation (8.12), the most important for physics
of these first-order ODEs, may be solved exactly.

Let us look for an integrating factor $\alpha(x)$ so that

$$\alpha(x)\frac{dy}{dx} + \alpha(x)p(x)y = \alpha(x)q(x) \tag{8.13}$$

may be rewritten as

$$\frac{d}{dx}[\alpha(x)y] = \alpha(x)q(x). \tag{8.14}$$

The purpose of this is to make the left-hand side of Eq. (8.12) a derivative so
that it can be integrated by inspection. It also, incidentally, makes Eq. (8.12)
exact. Expanding Eq. (8.14), we obtain

$$\alpha(x)\frac{dy}{dx} + \frac{d\alpha}{dx}y = \alpha(x)q(x).$$

Comparison with Eq. (8.13) shows that we must require

$$\frac{d\alpha}{dx} = \alpha(x)p(x). \tag{8.15}$$

Here is a differential equation for $\alpha(x)$, with the variables $\alpha$ and $x$ separable.
We separate variables, integrate, and obtain

$$\alpha(x) = \exp\left[\int^x p(X)\,dX\right] \tag{8.16}$$

as our integrating factor. The lower limit is not written because it only
multiplies $\alpha$ and the ODE by a constant, which is irrelevant.

With $\alpha(x)$ known we proceed to integrate Eq. (8.14). This, of course, was
the point of introducing $\alpha$ in the first place. We have

$$\int^x \frac{d}{dX}[\alpha(X)y(X)]\,dX = \int^x \alpha(X)q(X)\,dX.$$

Now integrating by inspection, we have

$$\alpha(x)y(x) = \int^x \alpha(X)q(X)\,dX + C.$$

The constants from a constant lower limit of integration are absorbed in the
constant $C$. Dividing by $\alpha(x)$, we obtain

$$y(x) = [\alpha(x)]^{-1}\left\{\int^x \alpha(X)q(X)\,dX + C\right\}.$$

Finally, substituting in Eq. (8.16) for $\alpha$ yields

$$y(x) = \exp\left[-\int^x p(X)\,dX\right]\left\{\int^x \exp\left[\int^Z p(Y)\,dY\right] q(Z)\,dZ + C\right\}. \tag{8.17}$$

Here the (dummy) variables of integration have been rewritten as capitals.
Equation (8.17) is the complete general solution of the linear, first-order ODE,
Eq. (8.12). The portion

$$y_h(x) = C\exp\left[-\int^x p(X)\,dX\right] \tag{8.18}$$

corresponds to the case $q(x) = 0$ and is a general solution of the
homogeneous ODE because it contains the integration constant. The other term in
Eq. (8.17),

$$y_p(x) = \exp\left[-\int^x p(X)\,dX\right]\int^x \exp\left[\int^Z p(Y)\,dY\right] q(Z)\,dZ, \tag{8.19}$$

is a particular solution of the inhomogeneous ODE corresponding to
the specific source term $q(x)$.

Let us summarize this solution of the inhomogeneous ODE in terms of a
method called variation of the constant as follows. In the first step, we solve
the homogeneous ODE by separation of variables as before, giving

$$\frac{y'}{y} = -p, \qquad \ln y = -\int^x p(X)\,dX + \ln C, \qquad y(x) = Ce^{-\int^x p(X)\,dX}.$$

In the second step, we let the integration constant become $x$-dependent, that
is, $C \to C(x)$. This is the variation of the constant used to solve the
inhomogeneous ODE. Differentiating $y(x)$ we obtain

$$y' = -pCe^{-\int p(x)\,dx} + C'(x)e^{-\int p(x)\,dx} = -py(x) + C'(x)e^{-\int p(x)\,dx}.$$

Comparing with the inhomogeneous ODE we find the ODE for $C$:

$$C'e^{-\int p(x)\,dx} = q, \quad \text{or} \quad C(x) = \int^x e^{\int^X p(Y)\,dY} q(X)\,dX.$$

Substituting this $C$ into $y = C(x)e^{-\int^x p(X)\,dX}$ reproduces Eq. (8.19).
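
Equation (8.17) translates directly into a small numerical routine. The following sketch is our own illustration under stated assumptions: the lower limits of all integrals are fixed at the initial point $x_0$, so that $\alpha(x_0) = 1$ and the constant $C$ equals the initial value $y_0$, and the integrals are accumulated with the trapezoidal rule. The function name and the test equation $y' + y = x$ with $y(0) = 0$, whose solution is $y = x - 1 + e^{-x}$, are hypothetical choices.

```python
import math


def solve_linear_first_order(p, q, x0, y0, x_end, steps=2000):
    """Evaluate Eq. (8.17) numerically for y' + p(x) y = q(x), y(x0) = y0.

    With lower limits fixed at x0, alpha(x) = exp(int_{x0}^{x} p) and C = y0.
    Both integrals are accumulated with the trapezoidal rule.
    """
    h = (x_end - x0) / steps
    big_p = 0.0      # running integral of p from x0 to x
    integral = 0.0   # running integral of alpha * q from x0 to x
    x = x0
    prev_p = p(x)
    prev_aq = q(x)   # alpha(x0) = 1
    for _ in range(steps):
        x_next = x + h
        next_p = p(x_next)
        big_p += 0.5 * h * (prev_p + next_p)
        next_aq = math.exp(big_p) * q(x_next)
        integral += 0.5 * h * (prev_aq + next_aq)
        x, prev_p, prev_aq = x_next, next_p, next_aq
    return math.exp(-big_p) * (integral + y0)


if __name__ == "__main__":
    # y' + y = x with y(0) = 0; compare with x - 1 + exp(-x) at x = 1.
    numeric = solve_linear_first_order(lambda x: 1.0, lambda x: x, 0.0, 0.0, 1.0)
    print(numeric, 1.0 - 1.0 + math.exp(-1.0))
```

Setting $q = 0$ in the same routine returns the homogeneous solution of Eq. (8.18) with $C = y_0$.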

Now we prove the theorem that the solution of the inhomogeneous
ODE is unique up to an arbitrary multiple of the solution of the
homogeneous ODE.

To show this, suppose $y_1$, $y_2$ both solve the inhomogeneous ODE [Eq.
(8.12)]; then

$$y_1' - y_2' + p(x)(y_1 - y_2) = 0$$

follows by subtracting the ODEs and states that $y_1 - y_2$ is a solution of the
homogeneous ODE. The solution of the homogeneous ODE can always be
multiplied by an arbitrary constant. ■


We also prove the theorem that a first-order linear homogeneous ODE
has only one linearly independent solution. This is meant in the following
sense. If two solutions are linearly dependent, by definition they satisfy
$ay_1(x) + by_2(x) = 0$ with nonzero constants $a$, $b$ for all values of $x$. If the only
solution of this linear relation is $a = 0 = b$, then our solutions $y_1$ and $y_2$ are
said to be linearly independent.

To prove this theorem, suppose $y_1$, $y_2$ both solve the homogeneous ODE.
Then

$$\frac{y_1'}{y_1} = -p(x) = \frac{y_2'}{y_2} \quad \text{implies} \quad W(x) \equiv y_1' y_2 - y_1 y_2' = 0. \tag{8.20}$$

The functional determinant $W$ is called the Wronskian of the pair $y_1$, $y_2$. We
now show that $W = 0$ is the condition for them to be linearly dependent.
Assuming linear dependence, that is,

$$ay_1(x) + by_2(x) = 0$$

with nonzero constants $a$, $b$ for all values of $x$, we differentiate this linear
relation to get another linear relation

$$ay_1'(x) + by_2'(x) = 0.$$

The condition for these two homogeneous linear equations in the unknowns
$a$, $b$ to have a nontrivial solution is that their determinant be zero, which is
$W = 0$.

Conversely, from $W = 0$, there follows linear dependence because we can
find a nontrivial solution of the relation

$$\frac{y_1'}{y_1} = \frac{y_2'}{y_2}$$

by integration, which gives

$$\ln y_1 = \ln y_2 + \ln C, \quad \text{or} \quad y_1 = Cy_2.$$

Linear dependence and the Wronskian are generalized to three or more
functions in Section 8.3. ■


EXAMPLE 8.2.3 


Linear Independence The solutions of the linear oscillator equation
$y'' + \omega^2 y(x) = 0$ are $y_1 = \sin\omega x$, $y_2 = \cos\omega x$, which we check by differentiation.
The Wronskian becomes

$$\begin{vmatrix} \sin\omega x & \cos\omega x \\ \omega\cos\omega x & -\omega\sin\omega x \end{vmatrix} = -\omega \neq 0.$$

These two solutions, $y_1$ and $y_2$, are therefore linearly independent. For just two
functions this means that one is not a multiple of the other, which is obviously
true in this case.

You know that

$$\sin\omega x = \pm(1 - \cos^2\omega x)^{1/2},$$

but this is not a linear relation. ■
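
The Wronskian test is simple to evaluate numerically. The sketch below uses the determinant in the sign convention of this example, $y_1 y_2' - y_1' y_2$, with analytic derivatives supplied by hand. The sample frequency `OMEGA = 2.0` and the linearly dependent pair $y_1$, $y_3 = 3y_1$ are hypothetical choices made for illustration.

```python
import math


def wronskian(y1, dy1, y2, dy2, x):
    """Determinant | y1  y2 ; y1'  y2' | evaluated at x."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)


OMEGA = 2.0  # hypothetical sample frequency


def y1(x):
    return math.sin(OMEGA * x)


def dy1(x):
    return OMEGA * math.cos(OMEGA * x)


def y2(x):
    return math.cos(OMEGA * x)


def dy2(x):
    return -OMEGA * math.sin(OMEGA * x)


def y3(x):
    """A multiple of y1, hence linearly dependent on it."""
    return 3.0 * math.sin(OMEGA * x)


def dy3(x):
    return 3.0 * OMEGA * math.cos(OMEGA * x)


if __name__ == "__main__":
    for x in (0.0, 0.3, 1.7):
        print(x, wronskian(y1, dy1, y2, dy2, x), wronskian(y1, dy1, y3, dy3, x))
```

For every $x$ the first Wronskian equals $-\omega$, while the second vanishes, in line with the theorem.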

Note that if our linear first-order differential equation is homogeneous 
(q = 0), then it is separable. Otherwise, apart from special cases such as 
p = constant, q = constant, or q(x) = ap(x), Eq. (8.12) is not separable. 










Figure 8.1 Circuit with Resistance R and Inductance L in Series



EXAMPLE 8.2.4 


RL Circuit For a resistance-inductance circuit (Fig. 8.1 and Example 6.1.6)
Kirchhoff’s first law leads to

$$L\frac{dI(t)}{dt} + RI(t) = V(t) \tag{8.21}$$

for the current $I(t)$, where $L$ is the inductance and $R$ the resistance, both
constant. Here, $V(t)$ is the time-dependent input voltage.

From Eq. (8.16), our integrating factor $\alpha(t)$ is

$$\alpha(t) = \exp\int^t \frac{R}{L}\,dT = e^{Rt/L}.$$

Then by Eq. (8.17),

$$I(t) = e^{-Rt/L}\left[\int^t e^{RT/L}\,\frac{V(T)}{L}\,dT + C\right], \tag{8.22}$$

with the constant $C$ to be determined by an initial condition (a boundary
condition).

For the special case $V(t) = V_0$, a constant,

$$I(t) = e^{-Rt/L}\left[\frac{V_0}{L}\cdot\frac{L}{R}\,e^{Rt/L} + C\right] = \frac{V_0}{R} + Ce^{-Rt/L}.$$

For a first-order ODE one initial condition has to be given. If it is $I(0) = 0$,
then $C = -V_0/R$ and

$$I(t) = \frac{V_0}{R}\left[1 - e^{-Rt/L}\right]. \;■$$
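
As in the parachutist example, it is worth checking the result by substitution. The sketch below does so numerically: it evaluates $I(t) = (V_0/R)[1 - e^{-Rt/L}]$, forms $dI/dt$ by a central difference, and returns the residual of Eq. (8.21). The component values ($V_0 = 5$ V, $R = 10\ \Omega$, $L = 0.5$ H) and the function names are hypothetical and serve only as an illustration.

```python
import math


def current(t, v0, resistance, inductance):
    """I(t) = (V0/R) [1 - exp(-R t / L)], the solution with I(0) = 0."""
    return (v0 / resistance) * (1.0 - math.exp(-resistance * t / inductance))


def ode_residual(t, v0, resistance, inductance, h=1e-6):
    """L dI/dt + R I - V0, with dI/dt estimated by a central difference."""
    didt = (
        current(t + h, v0, resistance, inductance)
        - current(t - h, v0, resistance, inductance)
    ) / (2.0 * h)
    return inductance * didt + resistance * current(t, v0, resistance, inductance) - v0


if __name__ == "__main__":
    v0, r, ind = 5.0, 10.0, 0.5  # hypothetical values; time constant L/R = 0.05 s
    for t in (0.0, 0.01, 0.05, 0.2, 5.0):
        print(t, current(t, v0, r, ind), ode_residual(t, v0, r, ind))
```

The residual stays at the level of numerical noise, and the current saturates at $V_0/R$ after a few time constants $L/R$.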


ODEs of Special Type

Let us mention a few more types of ODEs that can be integrated analytically.


EXAMPLE 8.2.5


First-Order ODEs, with y/x Dependence The ODE $y' = f(y/x)$ is not of
the form of Eq. (8.12) in general but is homogeneous in $y$. The substitution
$z(x) = y(x)/x$, suggested by the form of the ODE, leads via $y' = xz' + z$ to
the ODE $xz' + z = f(z)$, which is not of the type in Eq. (8.12). However, it is
separable and can be integrated as follows:

$$z' = \frac{f(z) - z}{x}, \qquad \int\frac{dz}{f(z) - z} = \int\frac{dx}{x} = \ln x + \ln C.$$

An explicit case is the ODE

$$xyy' = y^2 - x^2, \quad \text{or} \quad y' = \frac{y}{x} - \frac{x}{y}.$$

In terms of $z(x) = y/x$, we obtain $xz' + z = z - \frac{1}{z}$, or $z\,dz = -dx/x$, which
has separated variables. We integrate it to get $z^2 = C - 2\ln x$, where $C$ is the
integration constant. We check that our solution $y = x\sqrt{C - 2\ln x}$ satisfies

$$y' = \sqrt{C - 2\ln x} - \frac{1}{\sqrt{C - 2\ln x}}, \quad \text{or} \quad \frac{yy'}{x} = \frac{y^2}{x^2} - 1.$$

The constant $C$ is determined by the initial condition. If, for example, $y(1) = 1$,
we obtain $C = 1$. ■
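
The explicit case can be cross-checked by integrating $y' = y/x - x/y$ numerically from the initial condition $y(1) = 1$ and comparing with $y = x\sqrt{1 - 2\ln x}$. The sketch below is our own illustration using a classical fourth-order Runge-Kutta step; note that for $C = 1$ the closed form is real only for $x < e^{1/2} \approx 1.65$, so the comparison point $x = 1.3$ is chosen inside that range.

```python
import math


def rhs(x, y):
    """Right-hand side of y' = y/x - x/y."""
    return y / x - x / y


def exact(x, c=1.0):
    """Closed-form solution y = x sqrt(C - 2 ln x)."""
    return x * math.sqrt(c - 2.0 * math.log(x))


def integrate_rk4(x0, y0, x_end, steps=3000):
    """Classical RK4 integration of y' = rhs(x, y) from (x0, y0) to x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y


if __name__ == "__main__":
    print(integrate_rk4(1.0, 1.0, 1.3), exact(1.3))
```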


Clairaut’s ODE $y = xy' + f(y')$ can be solved in closed form despite the
general nature of the function $f$ in it.

Replacing $y'$ by a constant $C$, we verify that each straight line $y = Cx + f(C)$
is a solution. The slope of each straight line coincides with the direction
of the tangent prescribed by the ODE. A systematic method to find this class
of solutions starts by setting $y' = u(x)$ in the ODE so that $y = xu + f(u)$ with
the differential $u\,dx = dy = u\,dx + x\,du + f'(u)\,du$. Dropping the $u\,dx$ term we
find

$$[x + f'(u)]\,du = 0.$$

Setting each factor equal to zero, $du = 0$ yields $u = C = \text{const.}$ and the straight
lines again. Next, eliminating $u$ from the other factor set to zero,

$$x + f'(u) = 0, \quad \text{and} \quad y = xu + f(u)$$

generates another solution of Clairaut’s ODE, a curve $(x(u), y(u))$ that no
longer contains the arbitrary constant $C$. From $y' = u$, we verify
$y = xu + f(u) = xy' + f(y')$. The pair of coordinates $x(u)$, $y(u)$ given previously
represents a curve parameterized by the variable $u$; it represents the envelope of
the class of straight lines $y = Cx + f(C)$ for various values of $C$ that are
tangents to this curve. The envelope of a class of solutions of an ODE is called
its singular solution; it does not involve an integration constant (and cannot
be adapted to initial conditions).

In general, geometric problems in which a curve is to be determined
from properties of its tangent at $x$, $y$ lead to Clairaut’s ODE as follows.
The tangent equation is given by

$$Y - y = y'(X - x), \quad \text{or} \quad Y = y'X + (y - xy'),$$













where X, Y are the coordinates of the tangent and y' is its slope. A property of 
the tangent can be expressed as some functional relation F(y', y — xy') = 0. 
Solving this relation for y — xy' yields Clairaut’s ODE. Let us illustrate this by 
the following example. 


EXAMPLE 8.2.6 


Envelope of Tangents as Singular Solution of Clairaut’s ODE Determine
a curve so that the length of the line segment $T_1T_2$ in Fig. 8.2 cut out of its
tangent by the coordinate axes $X$, $Y$ is a constant $a$. Setting $X = 0$ in the
previous tangent equation gives the length $OT_1$ from the origin to $T_1$ on the $Y$-axis
as $y - xy'$, and setting $Y = 0$ gives the $OT_2$ length on the $X$-axis as $(xy' - y)/y'$.
The right-angle triangle with corners $O T_1 T_2$ yields the tangent condition

$$(y - xy')^2 + \frac{(y - xy')^2}{y'^2} = a^2, \quad \text{or} \quad y = xy' \pm \frac{ay'}{\sqrt{y'^2 + 1}},$$

a Clairaut ODE with the general solution $y = xC \pm \frac{aC}{\sqrt{C^2 + 1}}$, which are straight
lines. The envelope of this class of straight lines is obtained by eliminating $u$
from

$$y = xu \pm \frac{au}{\sqrt{u^2 + 1}}, \qquad x \pm a\left[\frac{1}{\sqrt{u^2 + 1}} - \frac{u^2}{(\sqrt{u^2 + 1})^3}\right] = 0.$$

The second equation simplifies to $x \pm \frac{a}{(\sqrt{u^2 + 1})^3} = 0$. Substituting $u = \tan\varphi$ yields
$x \pm a\cos^3\varphi = 0$ and

$$y = \mp a\cos^3\varphi\,\tan\varphi \pm a\sin\varphi = \pm a\sin^3\varphi$$

from the first equation. Eliminating the parameter $\varphi$ from $x(\varphi)$, $y(\varphi)$ yields the
astroid $x^{2/3} + y^{2/3} = a^{2/3}$, plotted in Fig. 8.2. ■
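
The geometric content of this example can be verified numerically: at every point of the envelope, the tangent segment cut out by the coordinate axes should have length $a$. The sketch below takes the upper signs, $x = -a\cos^3\varphi$, $y = a\sin^3\varphi$, with tangent slope $u = \tan\varphi$. The function names and the sample values of $a$ and $\varphi$ are our own illustrative choices.

```python
import math


def astroid_point(a, phi):
    """Point on the envelope (upper signs): x = -a cos^3(phi), y = a sin^3(phi)."""
    return -a * math.cos(phi) ** 3, a * math.sin(phi) ** 3


def tangent_segment_length(a, phi):
    """Length of the tangent segment between the two coordinate axes."""
    x, y = astroid_point(a, phi)
    slope = math.tan(phi)                # u = y' = tan(phi)
    y_intercept = y - x * slope          # X = 0 in Y = y'X + (y - x y')
    x_intercept = -y_intercept / slope   # Y = 0 in the same tangent equation
    return math.hypot(x_intercept, y_intercept)


def astroid_residual(a, phi):
    """|x|^(2/3) + |y|^(2/3) - a^(2/3); absolute values give real cube roots."""
    x, y = astroid_point(a, phi)
    return abs(x) ** (2.0 / 3.0) + abs(y) ** (2.0 / 3.0) - a ** (2.0 / 3.0)


if __name__ == "__main__":
    a = 2.0
    for phi in (0.3, 0.8, 1.2):
        print(phi, tangent_segment_length(a, phi), astroid_residual(a, phi))
```

The printed lengths all equal $a$ and the residuals vanish up to rounding, confirming that the envelope is the astroid.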


Figure 8.2 Astroid as Envelope of Tangents of Constant Length $T_1T_2 = a$

SUMMARY


First-order differential equations will be discussed again in Chapter 15 in
connection with Laplace transforms, in Chapter 18 with regard to the Euler
equation of the calculus of variations, and in Chapter 19 with regard to
nonlinear (Riccati and Bernoulli’s) ODEs. Numerical techniques for solving
first-order differential equations are examined in Section 8.7.

In summary, first-order ODEs of the implicit form $F(x, y, y') = 0$ (as
discussed in the context of Clairaut’s ODE) or explicit form $y' = f(x, y)$ contain
the variable $x$, the unknown function $y(x)$, and its derivative $dy/dx = y'(x)$. The
general solution contains one arbitrary constant, called the integration
constant, which often is determined by an initial condition $y(x_0) = y_0$ involving
given constants $x_0$, $y_0$. Such ODEs are sometimes called initial value problems.

Among the simplest ODEs are separable equations $y' = f(x, y) = -\frac{P(x)}{Q(y)}$ of
Section 8.2. Their general solution is obtained by the integration
$\int_{x_0}^{x} P(X)\,dX + \int_{y_0}^{y} Q(Y)\,dY = \text{const.}$

Closely related are the more general exact differential equations

$$P(x, y)\,dx + Q(x, y)\,dy = d\varphi = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy$$

with the integrability condition $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$. If the integrability condition is not
satisfied, a solution $\varphi(x, y)$ does not exist. In that case, one has to search for
an integrating factor $\alpha(x, y)$ so that $\frac{\partial(\alpha P)}{\partial y} = \frac{\partial(\alpha Q)}{\partial x}$ holds.

Linear first-order equations $y' + p(x)y = q(x)$ are common ODEs. The
radioactive decay law and electrical circuits are prime examples. The
homogeneous ODE $y' + py = 0$ is separable and integrated first, yielding
$\ln y + \int^x p\,dX = \ln C$; then the integration constant $C \to C(x)$ is varied to find the
solution of the inhomogeneous ODE.


EXERCISES 

8.2.1 From Kirchhoff’s law, the current $I$ in an RC (resistance-capacitance)
circuit [change $L$ to $C$ in Fig. 8.1 and remove $V(t)$; that is, short out the
circuit] obeys the equation

$$R\frac{dI}{dt} + \frac{1}{C}I = 0.$$

(a) Find $I(t)$.

(b) For a capacitance of 10,000 μF charged to 100 V and discharging
through a resistance of 1 mΩ, find the current $I$ for $t = 0$ and for
$t = 100$ sec.

Note. The initial voltage is $I_0R$ or $Q/C$, where $Q = \int_0^\infty I(t)\,dt$.

8.2.2 The Laplace transform of Bessel’s equation ($n = 0$) leads to

$$(s^2 + 1)f'(s) + sf(s) = 0.$$

Solve for $f(s)$.

8.2.3 The decay of a population by catastrophic two-body collisions is
described by

$$\frac{dN}{dt} = -kN^2$$

for $t > 0$. This is a first-order, nonlinear differential equation. Derive
the solution

$$N(t) = N_0\left(1 + \frac{t}{t_0}\right)^{-1},$$

where $t_0 = (kN_0)^{-1}$ and $N_0$ is the population at time $t = 0$. This implies
an infinite population at $t = -t_0$, which is irrelevant because the initial
value problem starts at $t = 0$ with $N(0) = N_0$.

8.2.4 The rate of a particular chemical reaction $A + B \to C$ is proportional
to the concentrations of the reactants $A$ and $B$:

$$\frac{dC(t)}{dt} = \alpha[A(0) - C(t)][B(0) - C(t)],$$

where $A(0) - C(t)$ is the amount of $A$ left to react at time $t$, and similarly
for $B$.

(a) Find $C(t)$ for $A(0) \neq B(0)$.

(b) Find $C(t)$ for $A(0) = B(0)$.

The initial condition is that $C(0) = 0$.

8.2.5 A boat coasting through the water experiences a resisting force
proportional to $v^n$, where $v$ is the boat’s instantaneous velocity and $n$ an
integer. Newton’s second law leads to

$$m\frac{dv}{dt} = -kv^n.$$

With $v(t = 0) = v_0$, $x(t = 0) = 0$, integrate to find $v$ as a function of
time and $v$ as a function of distance.

8.2.6 The differential equation 


$$P(x, y)\,dx + Q(x, y)\,dy = 0$$


is exact. Verify that 



is a solution. 


8.2.7 The differential equation 


$$P(x, y)\,dx + Q(x, y)\,dy = 0$$


is exact. If 



show that 


$$\frac{\partial\varphi}{\partial x} = P(x, y), \qquad \frac{\partial\varphi}{\partial y} = Q(x, y).$$

Hence, $\varphi(x, y) = \text{constant}$ is a solution of the original differential
equation.

8.2.8 Prove that Eq. (8.13) is exact in the sense of Eq. (8.8), provided that
$\alpha(x)$ satisfies Eq. (8.15).

8.2.9 A certain differential equation has the form 

$$f(x)\,dx + g(x)h(y)\,dy = 0,$$

with none of the functions f(x), g(x), h(y) identically zero. Show that
a necessary and sufficient condition for this equation to be exact is that
g(x) = constant.


8.2.10 Show that

$$y(x) = \exp\left[-\int^x p(t)\,dt\right]\left\{\int^x \exp\left[\int^s p(t)\,dt\right] q(s)\,ds + C\right\}$$

is a solution of

$$\frac{dy}{dx} + p(x)y(x) = q(x)$$

by differentiating the expression for y(x) and substituting into the differential
equation.


8.2.11 The motion of a body falling in a resisting medium may be described
by

$$m\frac{dv}{dt} = mg - bv$$

when the retarding force is proportional to the velocity, v. Find the
velocity. Evaluate the constant of integration by demanding that
v(0) = 0. Explain the signs of the terms mg and bv.


8.2.12 The rate of evaporation from a particular spherical drop of liquid (constant
density) is proportional to its surface area. Assuming this to be the
sole mechanism of mass loss, find the radius of the drop as a function
of time.


8.2.13 In the linear homogeneous differential equation

$$\frac{dv}{dt} = -av$$

the variables are separable. When the variables are separated the equation
is exact. Solve this differential equation subject to $v(0) = v_0$ by the
following three methods:

(a) separating variables and integrating;

(b) treating the separated variable equation as exact; and

(c) using the result for a linear homogeneous differential equation.

ANS. $v(t) = v_0 e^{-at}$.


8.2.14 Bernoulli’s equation,

$$\frac{dy}{dx} + f(x)y = g(x)y^n,$$

is nonlinear for n ≠ 0 or 1. Show that the substitution $u = y^{1-n}$ reduces
Bernoulli’s equation to a linear equation.

ANS. $\frac{du}{dx} + (1 - n)f(x)u = (1 - n)g(x)$.

8.2.15 Solve the linear, first-order equation, Eq. (8.12), by assuming y(x) =
u(x)v(x), where v(x) is a solution of the corresponding homogeneous
equation [q(x) = 0].

8.2.16 (a) Rework Example 8.2.2 with an initial speed $v_i$ = 60 miles/hr, when
the parachute opens. Find v(t).

(b) For a skydiver in free fall (no parachute) use the much smaller
friction coefficient b = 0.25 kg/m and m = 70 kg. What is the limiting
velocity in this case?

ANS. $v_0$ = 52 m/sec = 187 km/hr.

8.2.17 The flow lines of a fluid are given by the hyperbolas xy = C = const.
Find the orthogonal trajectories (equipotential lines) and plot them
along with the flow lines using graphical software.

Hint. Start from y' = tan α for the hyperbolas.

8.2.18 Heat flows in a thin plate in the xy-plane along the hyperbolas xy =
const. What are the lines of constant temperature (isotherms)?

8.2.19 Solve the ODE y' = ay/x for real a and initial condition y(0) = 1.

8.2.20 Solve the ODE $y' = y + y^2$ with y(0) = 1.

8.2.21 Solve the ODE $y' = \frac{1}{x + y}$ with y(0) = 0.

ANS. $x(y) = e^y - 1 - y$.

8.2.22 Find the general solution of $y'^3 - 4xyy' + 8y^2 = 0$ and its singular
solution. Plot them.

ANS. $y = C(x - C)^2$. The singular solution is $y = \frac{4}{27}x^3$.


8.3 Second-Order ODEs 


Linear ODEs of second order are most common in physics and engineering
applications because of dynamics: In classical mechanics the acceleration is
a second-order derivative and so is the kinetic energy in quantum mechanics.
Thus, any problem of classical mechanics, where we describe the motion of
a particle subject to a force, involves an ODE. Specifically, a force or driving
term leads to an inhomogeneous ODE. In quantum mechanics we are led to
the Schrödinger equation, a PDE. We will develop methods to find particular
solutions of the inhomogeneous ODE and the general solution of the
homogeneous ODE, such as the variation of constants, power-series expansion,
and Green’s functions. Special classes of ODEs are those with constant
coefficients that occur in RLC electrical circuits and harmonic oscillators in
classical mechanics. The simple harmonic oscillator of quantum mechanics is
treated in Chapter 13. Nonlinear ODEs are addressed in Chapter 19. We start
this section with examples of special classes of ODEs. We use the standard
notation² $dy/dx = y'$, $d^2y/dx^2 = y''$.

In Examples 8.3.1 and 8.3.2, we encounter a general feature. Because the 
solution of a second-order ODE involves two integrations, the general 
solution will contain two integration constants that may be adjusted 
to initial or boundary conditions. 

When one variable is missing in the ODE, such as x or y, the ODE can be 
reduced to a first-order ODE. 


EXAMPLE 8.3.1 


Second-Order ODEs, Missing Variable y If the unknown function y is
absent from the ODE, as in

$$y'' = f(y', x), \tag{8.23}$$

then it becomes a first-order ODE for z(x) = y'(x), z' = f(z, x). If $z(x, C_1)$ is
a solution of this ODE depending on an integration constant $C_1$, then

$$y(x) = \int^x z(X, C_1)\,dX + C_2$$

is the general solution of the second-order ODE.

A simple example is the ODE y'' = y' with boundary conditions y(0) = 1,
y(−∞) = 0.

Setting z = y', we solve dz/dx = z by integrating

$$\int^z \frac{dZ}{Z} = \ln z = \int^x dX = x + \ln C_1.$$

Exponentiating we obtain

$$z = C_1 e^x = \frac{dy}{dx}.$$

Integrating again we find

$$y = C_1\int^x e^X\,dX + C_2 = C_1 e^x + C_2.$$

We check our solution by differentiating it twice: $y' = C_1 e^x$, $y'' = C_1 e^x = y'$.
The boundary conditions y(0) = 1, y(−∞) = 0 determine the integration
constants $C_1$, $C_2$. They give $C_1 + C_2 = 1$ and $C_2 = 0$ so that $C_1 = 1$ results, and
the solution is $y = e^x$.

Another specific case is $y'' = y'^2$ with initial conditions y(0) = 2, y'(0) = −1.


2 This prime notation y' was introduced by Lagrange in the late 18th century as an abbreviation 
for Leibniz’s more explicit but more cumbersome dy/dx. 

We start by integrating $z' = z^2$, or

$$\int^z \frac{dZ}{Z^2} = -\frac{1}{z} = \int^x dX + C_1.$$

This yields $z = y' = -\frac{1}{x + C_1}$. Integrating again we find

$$y(x) = -\ln(x + C_1) + C_2.$$

Checking this solution gives $y'' = (x + C_1)^{-2} = y'^2$. The initial conditions yield
$2 = -\ln C_1 + C_2$, $-1 = -1/C_1$ so that $C_1 = 1$, implying $C_2 = 2$. The solution
is $y = -\ln(x + 1) + 2$.
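
As a numerical cross-check of the reduction z = y', the following Python sketch integrates $y'' = y'^2$ as the first-order pair y' = z, $z' = z^2$ with a classical fourth-order Runge-Kutta step and compares the result at x = 1 with the closed form y = −ln(x + 1) + 2. The helper name `rk4_missing_y` and the step count are illustrative assumptions, not part of the text.

```python
import math

def rk4_missing_y(f, x0, y0, z0, x_end, steps=2000):
    """Integrate y'' = f(y', x) as the pair y' = z, z' = f(z, x) with RK4.

    Hypothetical helper for illustration; returns (y, z) at x_end.
    """
    h = (x_end - x0) / steps
    x, y, z = x0, y0, z0
    for _ in range(steps):
        # y does not appear in f, so only z and x enter the slopes
        k1z = f(z, x)
        k2z = f(z + 0.5 * h * k1z, x + 0.5 * h)
        k3z = f(z + 0.5 * h * k2z, x + 0.5 * h)
        k4z = f(z + h * k3z, x + h)
        k1y = z
        k2y = z + 0.5 * h * k1z
        k3y = z + 0.5 * h * k2z
        k4y = z + h * k3z
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        x += h
    return y, z

# y'' = y'^2 with y(0) = 2, y'(0) = -1, integrated from x = 0 to x = 1
y_num, z_num = rk4_missing_y(lambda z, x: z * z, 0.0, 2.0, -1.0, 1.0)
y_exact = 2.0 - math.log(2.0)   # closed form y = -ln(x + 1) + 2 at x = 1
z_exact = -0.5                  # closed form y' = -1/(x + 1) at x = 1
print(abs(y_num - y_exact), abs(z_num - z_exact))
```

Both printed differences should be far below the step size, which supports the two integration constants found above.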

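A third case is the ODE $y'' = (xy')^2$. We solve $z' = (xz)^2$ by separating
variables:

$$\int^z \frac{dZ}{Z^2} = -\frac{1}{z} = \int^x X^2\,dX = \frac{1}{3}\left(x^3 - C_1^3\right).$$

We have chosen the integration constant in this special cubic form so that we
can factorize the third-order polynomial

$$x^3 - C_1^3 = (x - C_1)(x^2 + C_1 x + C_1^2)$$

and, in the ODE,

$$z = y' = -\frac{3}{x^3 - C_1^3},$$

decompose the inverse polynomial into partial fractions

$$\frac{1}{x^3 - C_1^3} = \frac{1}{3C_1^2}\left[\frac{1}{x - C_1} - \frac{x + 2C_1}{x^2 + C_1 x + C_1^2}\right],$$

where the quadratic factor has the complex conjugate roots $x = -\frac{C_1}{2}(1 \pm i\sqrt{3})$.
Integrating the ODE yields the solution

$$y(x) = -\frac{1}{C_1^2}\left[\ln(x - C_1) - \frac{1}{2}\ln(x^2 + C_1 x + C_1^2) - \sqrt{3}\arctan\frac{2x + C_1}{\sqrt{3}\,C_1}\right] + C_2.$$

■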
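
A central-difference check can confirm that a real-valued antiderivative of $-3/(x^3 - C_1^3)$ satisfies $y'' = (xy')^2$. The Python sketch below assumes $x > C_1 > 0$ so that the logarithm is real; the helper names and the step h are illustrative choices, not taken from the text.

```python
import math

def y_third_case(x, c1, c2=0.0):
    """Real antiderivative of z = -3/(x**3 - c1**3), assuming x > c1 > 0."""
    log_part = math.log(x - c1) - 0.5 * math.log(x * x + c1 * x + c1 * c1)
    atan_part = math.sqrt(3.0) * math.atan((2.0 * x + c1) / (c1 * math.sqrt(3.0)))
    return -(log_part - atan_part) / (c1 * c1) + c2

def first_derivative(x, c1, h=1e-3):
    """Central-difference estimate of y'."""
    return (y_third_case(x + h, c1) - y_third_case(x - h, c1)) / (2 * h)

def ode_residual(x, c1, h=1e-3):
    """Central-difference estimate of y'' - (x*y')**2; should be near zero."""
    y0, yp, ym = (y_third_case(x + s, c1) for s in (0.0, h, -h))
    d1 = (yp - ym) / (2 * h)
    d2 = (yp - 2 * y0 + ym) / (h * h)
    return d2 - (x * d1) ** 2

x, c1 = 2.0, 1.0
print(first_derivative(x, c1), -3.0 / (x**3 - c1**3))
print(ode_residual(x, c1))
```

The first line compares the numerical slope with $-3/(x^3 - C_1^3)$; the second prints the residual of the second-order ODE, limited only by the finite-difference error.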


EXAMPLE 8.3.2 


Second-Order ODEs, Missing Variable x If the variable x does not appear
in the ODE, as in

$$y'' = f(y', y), \tag{8.24}$$

then we seek a solution y' = z(y) instead of searching for y(x) directly. Using
the chain rule we obtain

$$y'' = \frac{dz}{dy}\frac{dy}{dx} = z\frac{dz}{dy} = f(z, y),$$

which is a first-order ODE for z(y). If we can find a solution $z(y, C_1)$, then we
can integrate y' = z(y) to get

$$\int^y \frac{dY}{z(Y, C_1)} = \int^x dX = x + C_2.$$

■
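
The two-step procedure of this example can be sketched numerically by treating y as the independent variable: dz/dy = f(z, y)/z gives z(y), and dx/dy = 1/z gives x(y) by quadrature. The Python sketch below uses the illustrative ODE $y'' = y'^2/y$ (chosen here as an assumption, not taken from the text), for which z = C_1 y and $y = y_0 e^{C_1 x}$; it requires z ≠ 0 along the integration path.

```python
import math

def integrate_in_y(f, y0, z0, y_end, steps=2000):
    """RK4 in the variable y for dz/dy = f(z, y)/z and dx/dy = 1/z.

    Hypothetical helper; starts at x = 0 and returns (z, x) at y_end.
    Assumes z stays nonzero between y0 and y_end.
    """
    h = (y_end - y0) / steps
    y, z, x = y0, z0, 0.0

    def rhs(y, z):
        return f(z, y) / z, 1.0 / z

    for _ in range(steps):
        k1z, k1x = rhs(y, z)
        k2z, k2x = rhs(y + 0.5 * h, z + 0.5 * h * k1z)
        k3z, k3x = rhs(y + 0.5 * h, z + 0.5 * h * k2z)
        k4z, k4x = rhs(y + h, z + h * k3z)
        z += h * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h
    return z, x

# y'' = y'^2 / y with y(0) = 1, y'(0) = 0.5, followed until y = e
z_end, x_end = integrate_in_y(lambda z, y: z * z / y, 1.0, 0.5, math.e)
print(z_end, 0.5 * math.e)   # z = C1*y with C1 = 0.5
print(x_end, 2.0)            # x = ln(y/y0)/C1 = 2
```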

EXAMPLE 8.3.3


$y'' + f(x)y' + g(y)y'^2 = 0$ This more general and nonlinear ODE is a combination
of the types treated in Examples 8.3.1 and 8.3.2 so that we try a product
solution y' = v(x)w(y) incorporating the previous solution types. Differentiating
this ansatz (trial solution) and substituting into the ODE we find

$$y'' = v'w + v\frac{dw}{dy}y' = v'w + v^2 w\frac{dw}{dy} = -fvw - gv^2w^2.$$

Here, we divide by an overall factor $v^2 w$ without loss of generality because we
reject y' = 0 as a trivial solution. We can solve the resulting ODE

$$\frac{v' + f(x)v}{v^2} + \frac{dw}{dy} + g(y)w(y) = 0$$

by choosing v(x) as a solution of the first-order ODE v' + f(x)v(x) = 0 from the
first term alone and w(y) as a solution of the first-order ODE dw/dy + g(y)w(y) = 0
from the second term alone. Both ODEs can be solved by separating variables

$$\int^v \frac{dV}{V} = -\int^x f(X)\,dX = \ln v, \qquad \int^w \frac{dW}{W} = -\int^y g(Y)\,dY = \ln w.$$

Alternatively, integrating the ODE written in the form

$$\frac{y''}{y'} + f(x) + g(y)y' = 0$$

yields

$$\ln y' + \int^x f(X)\,dX + \int^y g(Y)\,dY = C,$$

where C is an integration constant. Exponentiating this result gives the same
solution.

Let us illustrate a more specific example:

$$xyy'' + yy' - xy'^2 = 0,$$

where f(x) = 1/x and g(y) = −1/y so that $\ln v = -\ln x + \ln C_1$ [i.e., $v(x) = C_1/x$]
and $\ln w = \ln y + \ln C_2$ [i.e., $w(y) = C_2 y$]. Therefore, $y' = C_1 C_2\, y/x$, which we
integrate as

$$\ln y = C_1 C_2 \ln x + \ln C_2$$

so that finally $y(x) = C_2 x^{C_1 C_2}$, a power law that indeed satisfies the ODE. ■


EXAMPLE 8.3.4


Euler’s ODE Euler’s ODE,

$$ax^2y'' + bxy' + cy = 0, \tag{8.25}$$

is a homogeneous linear ODE that can be solved with a power ansatz $y = x^p$.
This power law is a natural guess because the reduction of the exponent by
differentiation is restored by the coefficients x, $x^2$ of the y' and y'' terms, each
producing the same power.

Substituting $y' = px^{p-1}$, $y'' = p(p - 1)x^{p-2}$ into the ODE yields

$$[ap(p - 1) + bp + c]x^p = 0,$$

an algebraic equation for the exponent but only for the homogeneous
ODE. Now we drop the factor $x^p$ to find two roots $p_1$, $p_2$ from the quadratic
equation. If both exponents $p_i$ are real, the general solution is

$$y(x) = C_1 x^{p_1} + C_2 x^{p_2}.$$

If the exponents are complex conjugates $p_{1,2} = r \pm iq$, then the Euler
identity for $x^{iq} = e^{iq\ln x}$ yields the general solution

$$y(x) = x^r[C_1\cos(q\ln x) + C_2\sin(q\ln x)].$$

If there is a degenerate solution $p_1 = p_2 = p$ for the exponent, we approach
the degenerate case by letting the exponents become equal in the linear combination
$(x^{p+\varepsilon} - x^p)/\varepsilon$, which is a solution of the ODE for ε → 0. This may
be achieved by slightly varying the coefficients a, b, c of the ODE so that the
degenerate exponent p splits into p + ε and p. Thus, we are led to differentiate
$x^p$ with respect to p. This yields the second solution $x^p\ln x$ and the general
solution

$$y = x^p(C_1 + C_2\ln x).$$

A specific example is the ODE

$$x^2y'' + 3xy' + y = 0 \quad\text{with}\quad p(p - 1) + 3p + 1 = 0 = (p + 1)^2$$

so that p = −1 is a degenerate exponent. Thus, the solution is
$y(x) = \frac{C_1}{x} + C_2\frac{\ln x}{x}$. ■
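
The exponent equation and the degenerate second solution can be checked with a few lines of Python. This is a minimal sketch; the function names `euler_exponents` and `euler_residual` are hypothetical, and the derivatives of ln x / x are entered by hand.

```python
import cmath
import math

def euler_exponents(a, b, c):
    """Roots of a*p*(p - 1) + b*p + c = 0, i.e. a*p**2 + (b - a)*p + c = 0."""
    s = cmath.sqrt((b - a) ** 2 - 4 * a * c)
    return (-(b - a) + s) / (2 * a), (-(b - a) - s) / (2 * a)

def euler_residual(a, b, c, y, dy, d2y, x):
    """Left-hand side a*x^2*y'' + b*x*y' + c*y evaluated at x."""
    return a * x * x * d2y(x) + b * x * dy(x) + c * y(x)

# x^2 y'' + 3x y' + y = 0 has the double exponent p = -1
p1, p2 = euler_exponents(1, 3, 1)
print(p1, p2)

# second solution y = ln(x)/x with its first and second derivatives
y = lambda x: math.log(x) / x
dy = lambda x: (1 - math.log(x)) / x**2
d2y = lambda x: (2 * math.log(x) - 3) / x**3
print(euler_residual(1, 3, 1, y, dy, d2y, 2.5))
```

The roots are returned as complex numbers so that the same helper also covers the case $p_{1,2} = r \pm iq$.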


EXAMPLE 8.3.5 


ODEs with Constant Coefficients ODEs with constant coefficients

$$ay'' + by' + cy = 0 \tag{8.26}$$

are solved with the exponential ansatz $y = e^{px}$. This is a natural guess because
differentiation reproduces the exponential up to a multiplicative constant, y' =
py and $y'' = p^2 y$. Hence, substituting the exponential ansatz reduces the
ODE to the quadratic equation

$$ap^2 + bp + c = 0$$

for the exponent. If there are two real roots $p_1$, $p_2$, then

$$y = C_1 e^{p_1 x} + C_2 e^{p_2 x}$$

is the general solution. If $p_1 > 0$ or $p_2 > 0$, we have an exponentially
growing solution. When $p_1 < 0$ and $p_2 < 0$, we have the overdamped solution
displayed in Fig. 8.3.

Figure 8.3 Typical Solution of ODE with Constant Coefficients: Two Negative Exponents

Figure 8.4 Typical Solution of ODE with Constant Coefficients: Two Complex Conjugate Exponents



If there are two complex conjugate roots, then Euler’s identity yields

$$p_{1,2} = r \pm iq, \qquad y(x) = e^{rx}(C_1\cos qx + C_2\sin qx)$$

as the general oscillatory or underdamped solution (Fig. 8.4).

If there is one degenerate exponent, we approach the degenerate case
with two slightly different exponents p + ε and p for ε → 0 in the solution
$[e^{(p+\varepsilon)x} - e^{px}]/\varepsilon$ of the ODE. Again, as in Example 8.3.4, this leads us to differentiate
$e^{px}$ with respect to p to find the second solution $xe^{px}$, giving the
general critically damped solution $y = e^{px}(C_1 + C_2 x)$ for the double-root
case (Fig. 8.5). See also Examples 15.8.1, 15.9.1, and 15.10.1 for a solution by
Laplace transform. ■
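
The three cases of this example (two real roots, a complex conjugate pair, and a double root) can be collected in one small Python routine that also fixes the constants from initial conditions y(0) and y'(0). This is a sketch under the assumption of real coefficients and real initial data; the name `solve_const_coeff` and the tolerance are illustrative.

```python
import cmath
import math

def solve_const_coeff(a, b, c, y0, dy0, tol=1e-12):
    """Return y(x) solving a*y'' + b*y' + c*y = 0 with y(0) = y0, y'(0) = dy0."""
    disc = b * b - 4 * a * c
    if abs(disc) <= tol:
        # double root: y = exp(p*x) * (C1 + C2*x)
        p = -b / (2 * a)
        c1 = y0
        c2 = dy0 - p * y0
        return lambda x: math.exp(p * x) * (c1 + c2 * x)
    # distinct roots (real or complex): y = C1*exp(p1*x) + C2*exp(p2*x)
    s = cmath.sqrt(disc)
    p1 = (-b + s) / (2 * a)
    p2 = (-b - s) / (2 * a)
    k2 = (dy0 - p1 * y0) / (p2 - p1)
    k1 = y0 - k2
    # for real data the imaginary parts cancel, so only the real part is kept
    return lambda x: (k1 * cmath.exp(p1 * x) + k2 * cmath.exp(p2 * x)).real

overdamped = solve_const_coeff(1, 3, 2, 1, 0)   # roots -1, -2
critical = solve_const_coeff(1, 2, 1, 1, 0)     # double root -1
oscillatory = solve_const_coeff(1, 0, 1, 0, 1)  # roots +i, -i
print(overdamped(1.0), critical(1.0), oscillatory(0.7))
```

For the three sample calls the closed forms are $2e^{-x} - e^{-2x}$, $e^{-x}(1 + x)$, and sin x, respectively.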

Because ODEs with constant coefficients and Euler’s ODEs are linear in
the unknown function y and homogeneous, we have used the superposition
principle in Examples 8.3.4 and 8.3.5: If $y_1$, $y_2$ are two solutions of the homogeneous
ODE, so is the linear combination $C_1 y_1 + C_2 y_2$ with constants $C_1$, $C_2$
that are fixed by initial or boundary conditions as usual.

The same exponential form $y(x) = e^{px}$ leads to the solutions of nth-order
ODEs

$$a_0 y^{(n)} + a_1 y^{(n-1)} + \cdots + a_{n-1}y' + a_n y = 0$$

with constant coefficients $a_i$ in terms of exponents $p_i$ that are roots of the
polynomial equation

$$a_0 p^n + a_1 p^{n-1} + \cdots + a_{n-1}p + a_n = 0.$$

If the roots $p_i$ are all distinct, the general solution is the linear combination

$$y(x) = \sum_{i=1}^{n} b_i e^{p_i x},$$

where the constants $b_i$ are determined by initial or boundary conditions.

Figure 8.5 Typical Solutions of ODE with Constant Coefficients: Two Equal Exponents

Other generalizations are coupled ODEs with constant coefficients. Several 
cases are treated in Chapter 19 (Examples 19.4.6-19.4.10) in the context of 
linear approximations to nonlinear ODEs. 


Inhomogeneous Linear ODEs and Particular Solutions 


We have already discussed inhomogeneous first-order ODEs, such as Eq. (8.12).
The general solution $y(x) = y_h(x) + y_p(x)$ is a sum of the general solution $y_h$
of the homogeneous ODE and a particular solution $y_p$ of the inhomogeneous
ODE, which can be immediately verified by substituting y into the inhomogeneous
ODE. This theorem generalizes to nth-order linear ODEs, the general
solution being $y(x) = y_p(x) + \sum_{i=1}^{n} c_i y_i(x)$, where $y_i$ are the independent
solutions of the homogeneous ODE with constants $c_i$. The particular solution
$y_p$ usually inherits its form from the driving term q(x) provided differentiations
produce the same types of functions that q(x) contains. The next few
examples are cases in point, where we treat special types of functions q(x),
such as power laws, periodic functions, exponentials, and their combinations.


Inhomogeneous Euler ODE 

Let us look at the inhomogeneous Euler ODE with a power law driving term

$$ax^2y'' + bxy' + cy = Dx^d,$$

where the exponent d and strength D are known numbers. The power law is
the natural form for the Euler ODE because each term retains its exponent.
Substituting the ansatz $y_p = Ax^d$ into the Euler ODE, we realize that each
term contains the same power $x^d$, which we can drop. We obtain

$$A[ad(d - 1) + bd + c] = D,$$

which determines A provided d is not an exponent of the homogeneous ODE.

If d is an exponent of the homogeneous ODE, that is, ad(d − 1) + bd +
c = 0, then our solution $y_p = x^d(A + B\ln x)$ is a linear combination of both
contributions of the degenerate case in Example 8.3.4. Substituting this trial
solution into Euler’s ODE yields

$$\begin{aligned}
Dx^d &= ax^2y'' + bxy' + cy \\
&= x^d[a(d - 1)dA + a(d - 1)dB\ln x + a(2d - 1)B + bdA + bdB\ln x + bB + cA + cB\ln x] \\
&= x^d\big(A[a(d - 1)d + bd + c] + B[a(2d - 1) + b] + B[a(d - 1)d + bd + c]\ln x\big),
\end{aligned}$$

where the terms containing a come from $y_p''$, those containing b from $y_p'$, and
those containing c from $y_p$. Now we drop $x^d$ and use ad(d − 1) + bd + c = 0,
obtaining

$$D = B[a(2d - 1) + b],$$

thereby getting B in terms of D, whereas A is not determined by the source
term; A can be used to satisfy an initial or boundary condition. The source can
also have the more general form $x^d(D + E\ln x)$ in the degenerate case.

For an exponential driving term

$$ax^2y'' + bxy' + cy = De^{-x},$$

the powers of x in the ODE force us to a more complicated trial solution
$y_p = e^{-x}\sum_{n=0}^{\infty} a_n x^n$. Substituting this ansatz into Euler’s ODE yields recursion
relations for the coefficients $a_n$. Such power series solutions are treated more
systematically in Section 8.5. Similar complications arise for a periodic driving
term, such as sin ωx, which shows that these forms are not natural for Euler’s
ODE.
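
The two formulas for a power law driving term, A = D/[ad(d − 1) + bd + c] in the regular case and B = D/[a(2d − 1) + b] when d is an exponent of the homogeneous ODE, can be sketched in Python as follows. The function names are hypothetical, the free constant A is simply set to zero in the degenerate case, and the sketch assumes a(2d − 1) + b ≠ 0 (that is, d is not a double exponent).

```python
import math

def euler_particular(a, b, c, D, d, tol=1e-12):
    """Return (A, B) such that y_p = x**d * (A + B*ln x) solves
    a*x^2*y'' + b*x*y' + c*y = D*x**d."""
    indicial = a * d * (d - 1) + b * d + c
    if abs(indicial) > tol:
        return D / indicial, 0.0
    # d is an exponent of the homogeneous ODE: A stays free, chosen as 0 here
    return 0.0, D / (a * (2 * d - 1) + b)

def euler_lhs(a, b, c, A, B, d, x):
    """Left-hand side of the Euler ODE for y = x**d * (A + B*ln x)."""
    L = math.log(x)
    y = x**d * (A + B * L)
    x_dy = x**d * (d * (A + B * L) + B)
    x2_d2y = x**d * (d * (d - 1) * (A + B * L) + (2 * d - 1) * B)
    return a * x2_d2y + b * x_dy + c * y

# illustrative ODE x^2 y'' + 2x y' - 2y with homogeneous exponents 1 and -2
A1, B1 = euler_particular(1, 2, -2, 4.0, 2)   # regular case, source 4x^2
A2, B2 = euler_particular(1, 2, -2, 3.0, 1)   # degenerate case, source 3x
print((A1, B1), euler_lhs(1, 2, -2, A1, B1, 2, 1.7), 4.0 * 1.7**2)
print((A2, B2), euler_lhs(1, 2, -2, A2, B2, 1, 1.7), 3.0 * 1.7)
```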


Inhomogeneous ODE with Constant Coefficients 

We start with a natural driving term of exponential form

$$ay'' + by' + cy = De^{-dx},$$

where the strength D and exponent d are known numbers. We choose a particular
solution $y_p = Ae^{-dx}$ of the same form as the source, because the derivatives
preserve it. Substituting this $y_p$ into the ODE with constant coefficients a, b, c
yields

$$A[ad^2 - bd + c] = D,$$

determining A in terms of D, provided −d is not an exponent p of the homogeneous
ODE.

If the latter is the case, that is, $ad^2 - bd + c = 0$, we have to start from the
more general form $y_p = e^{-dx}(A + Bx)$ appropriate for the degenerate case of
Example 8.3.5. Substituting this $y_p$ into the ODE yields

$$\begin{aligned}
D &= ad^2(A + Bx) - 2adB - bd(A + Bx) + bB + c(A + Bx) \\
&= A[ad^2 - bd + c] + Bx[ad^2 - bd + c] + B(b - 2ad),
\end{aligned}$$

where the terms containing a come from $y_p''$, those containing b from $y_p'$, and c
from $y_p$. Now we drop the terms containing $ad^2 - bd + c = 0$ to obtain

$$B(b - 2ad) = D,$$

determining B in terms of D, while A remains free to be adjusted to an initial
or boundary condition.
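
The regular and degenerate cases for an exponential source can be checked in the same spirit. The Python sketch below uses hypothetical function names, sets the free constant A to zero in the degenerate case, and assumes b − 2ad ≠ 0.

```python
import math

def exp_particular(a, b, c, D, d, tol=1e-12):
    """Return (A, B) such that y_p = exp(-d*x) * (A + B*x) solves
    a*y'' + b*y' + c*y = D*exp(-d*x)."""
    char = a * d * d - b * d + c
    if abs(char) > tol:
        return D / char, 0.0
    # -d is an exponent of the homogeneous ODE: A stays free, chosen as 0 here
    return 0.0, D / (b - 2 * a * d)

def const_coeff_lhs(a, b, c, A, B, d, x):
    """Left-hand side a*y'' + b*y' + c*y for y = exp(-d*x) * (A + B*x)."""
    e = math.exp(-d * x)
    y = e * (A + B * x)
    dy = e * (B - d * (A + B * x))
    d2y = e * (d * d * (A + B * x) - 2 * d * B)
    return a * d2y + b * dy + c * y

# illustrative ODE y'' + 3y' + 2y with homogeneous exponents -1 and -2
A1, B1 = exp_particular(1, 3, 2, 4.0, 3)   # regular case, source 4*exp(-3x)
A2, B2 = exp_particular(1, 3, 2, 5.0, 1)   # degenerate case, source 5*exp(-x)
print((A1, B1), const_coeff_lhs(1, 3, 2, A1, B1, 3, 0.7), 4.0 * math.exp(-2.1))
print((A2, B2), const_coeff_lhs(1, 3, 2, A2, B2, 1, 0.7), 5.0 * math.exp(-0.7))
```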

A source term of polynomial form is solved by a particular solution of
polynomial form of the same degree if the coefficient of the y term in the ODE
is nonzero; if not, the degree of $y_p$ increases by one, etc.

Periodic source terms, such as cos ωx or sin ωx, are also natural and lead
to particular solutions of the form $y_p = A\cos\omega x + B\sin\omega x$, where both the
sine and cosine have to be included because the derivative of the sine gives
the cosine and vice versa. We deal with such a case in the next example.


EXAMPLE 8.3.6 


Electrical Circuit Let us take Example 8.2.4, include a capacitance C and an
external AC voltage $V(t) = V_0\sin\omega t$ in series to form an RLC circuit (Fig. 8.6).
Here, the sin ωt driving term leads to a particular solution $y_p \sim \sin(\omega t - \varphi)$, a
sine shape with the same frequency ω as the driving term.

The voltage drop across the resistor is RI, across the inductor it is given by
the instantaneous rate of change of the current, L dI/dt, and across the capacitor
it is given by Q/C with the charge Q(t), giving

$$L\frac{dI}{dt} + RI + \frac{Q}{C} = V_0\sin\omega t.$$

Because I(t) = dQ/dt, we differentiate both sides of this equation to obtain the
ODE with constant coefficients

$$L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{I}{C} = \omega V_0\cos\omega t.$$

Figure 8.6 Electrical Circuit: Resistance, Inductance, and Capacitance in Series

Comparing this ODE with the harmonic oscillator ODE in classical mechanics,
we see that the inductance L is the electrical analog of the mass, the resistance
R is the analog of the damping, and the inverse of the capacitance 1/C is the
analog of a spring constant, whereas the current I is the analog of the mechanical
displacement x(t). The general solution of the homogeneous ODE is

$$I_h = C_1 e^{p_1 t} + C_2 e^{p_2 t},$$

where $p = p_1$ and $p = p_2$ are the roots of the quadratic equation

$$p^2 + \frac{R}{L}p + \frac{1}{LC} = 0, \qquad p = -\frac{R}{2L} \pm \frac{1}{2L}\sqrt{R^2 - \frac{4L}{C}}.$$

Because of the dominant negative term −R/2L in p (note the negative sign
in the radicand), $I_h$ is a transient current that decays exponentially with
time.

We now look for the particular solution with the same harmonic form
as the driving voltage, $I_p = A\cos\omega t + B\sin\omega t$. This is called the steady-state
current with the same frequency as the input, which survives after
a sufficiently long time ($-p_1 t \gg 1$). This is seen from the general solution
$I = I_p + I_h$. In this sense, the steady-state current is an asymptotic form,
but it is a particular solution that is present from the initial time onward. We
differentiate $I_p$ twice, substitute into the ODE, and compare the coefficients
of the sin ωt and cos ωt terms. This yields

$$-\omega^2 L(A\cos\omega t + B\sin\omega t) + R\omega(-A\sin\omega t + B\cos\omega t) + \frac{1}{C}(A\cos\omega t + B\sin\omega t) = \omega V_0\cos\omega t$$

so that

$$-\omega^2 LA + \omega RB + \frac{A}{C} = \omega V_0, \qquad -\omega^2 LB - \omega RA + \frac{B}{C} = 0.$$

From the second of these equations we find

$$A = -B\frac{S}{R}, \qquad S = \omega L - \frac{1}{\omega C},$$

where S is defined as the reactance by electrical engineers. Substituting this
expression A into the first equation yields

$$B = \frac{V_0 R}{R^2 + S^2} \quad\text{so that}\quad A = -\frac{V_0 S}{R^2 + S^2}.$$

The steady-state current may also be written as

$$I_p = I_0\sin(\omega t - \varphi), \qquad I_0 = \sqrt{A^2 + B^2} = \frac{V_0}{\sqrt{R^2 + S^2}}, \qquad \tan\varphi = -\frac{A}{B} = \frac{S}{R},$$

where $\sqrt{R^2 + S^2}$ is the impedance. ■
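
The steady-state coefficients can be verified by inserting $I_p = A\cos\omega t + B\sin\omega t$ back into the circuit ODE. The Python sketch below does this for arbitrary illustrative component values (R = 2, L = 1, C = 0.5, V_0 = 1, ω = 3, all assumptions made here for the check); the function names are hypothetical.

```python
import math

def steady_state(R, L, C, V0, omega):
    """Return (A, B, I0, phi) for I_p = A cos(wt) + B sin(wt) = I0 sin(wt - phi)."""
    S = omega * L - 1.0 / (omega * C)   # reactance
    Z2 = R * R + S * S                  # impedance squared
    A = -V0 * S / Z2
    B = V0 * R / Z2
    I0 = V0 / math.sqrt(Z2)
    phi = math.atan2(S, R)              # tan(phi) = S/R, with R > 0
    return A, B, I0, phi

def circuit_residual(R, L, C, V0, omega, t):
    """L I'' + R I' + I/C - omega V0 cos(omega t) evaluated for I = I_p."""
    A, B, _, _ = steady_state(R, L, C, V0, omega)
    c, s = math.cos(omega * t), math.sin(omega * t)
    I = A * c + B * s
    dI = omega * (-A * s + B * c)
    d2I = -omega * omega * I
    return L * d2I + R * dI + I / C - omega * V0 * c

params = (2.0, 1.0, 0.5, 1.0, 3.0)      # R, L, C, V0, omega (illustrative)
A, B, I0, phi = steady_state(*params)
for t in (0.0, 0.4, 1.3):
    amplitude_form = I0 * math.sin(params[4] * t - phi)
    harmonic_form = A * math.cos(params[4] * t) + B * math.sin(params[4] * t)
    print(circuit_residual(*params, t), amplitude_form - harmonic_form)
```

Both printed columns should vanish to rounding accuracy, confirming the coefficient equations and the amplitude-phase form.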


More examples of coupled and nonlinear ODEs are given in Chapter 19, 
particularly Examples 19.4.6-19.4.10. 

Finally, let us address the uniqueness and generality of our solutions. If
we have found a particular solution of a linear inhomogeneous second-order
ODE

$$y'' + P(x)y' + Q(x)y = f(x), \tag{8.27}$$

then it is unique up to an additive solution of the homogeneous ODE. To
show this theorem, suppose $y_1$, $y_2$ are two solutions. Subtracting both ODEs
it follows that $y_1 - y_2$ is a solution of the homogeneous ODE

$$y'' + P(x)y' + Q(x)y = 0 \tag{8.28}$$

because the ODE is linear in y, y', y'' and f(x) cancels. ■


The general solution of the homogeneous ODE [Eq. (8.28)] is a linear
combination of two linearly independent solutions. To prove this theorem
we assume there are three solutions and show that there is a linear relation
between them. The analysis will lead us to the generalization of the Wronskian
of two solutions of a first-order ODE in Section 8.2. Therefore, now we consider
the question of linear independence of a set of functions.


Linear Independence of Solutions 


Given a set of functions $\varphi_\lambda$, the criterion for linear dependence is the existence
of a relation of the form

$$\sum_\lambda k_\lambda\varphi_\lambda = 0, \tag{8.29}$$

in which not all the coefficients $k_\lambda$ are zero. On the other hand, if the only
solution of Eq. (8.29) is $k_\lambda = 0$ for all λ, the set of functions $\varphi_\lambda$ is said to be
linearly independent. In other words, functions are linearly independent if
they cannot be obtained as solutions of linear relations that hold for all x.

It may be helpful to think of linear dependence of vectors. Consider A,
B, and C in three-dimensional space with A · B × C ≠ 0. Then no nontrivial
relation of the form

$$a\mathbf{A} + b\mathbf{B} + c\mathbf{C} = 0 \tag{8.30}$$

exists. A, B, and C are linearly independent. On the other hand, any fourth
vector D may be expressed as a linear combination of A, B, and C (see Section
2.1). We can always write an equation of the form

$$\mathbf{D} - a\mathbf{A} - b\mathbf{B} - c\mathbf{C} = 0, \tag{8.31}$$

and the four vectors are not linearly independent. The three noncoplanar
vectors A, B, and C span our real three-dimensional space.
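
The vector statement can be made concrete with a short Python sketch: a nonzero triple product A · B × C signals linear independence, and the coefficients a, b, c of Eq. (8.31) then follow from Cramer's rule as ratios of triple products. The function names and sample vectors are illustrative assumptions.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

def expand(d, a, b, c):
    """Coefficients (alpha, beta, gamma) with d = alpha*a + beta*b + gamma*c.

    Uses Cramer's rule; raises if a, b, c are coplanar.
    """
    t = triple(a, b, c)
    if t == 0:
        raise ValueError("A, B, C are linearly dependent")
    return triple(d, b, c) / t, triple(a, d, c) / t, triple(a, b, d) / t

A_vec, B_vec, C_vec = (1, 0, 0), (1, 1, 0), (1, 1, 1)
D_vec = (2, 3, 4)
print(triple(A_vec, B_vec, C_vec))        # nonzero, so A, B, C are independent
print(expand(D_vec, A_vec, B_vec, C_vec)) # coefficients of D in this basis
```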

Let us assume that the functions $\varphi_\lambda$ are differentiable as needed. Then,
differentiating Eq. (8.29) repeatedly, we generate a set of equations

$$\sum_\lambda k_\lambda\varphi_\lambda'(x) = 0, \tag{8.32}$$

$$\sum_\lambda k_\lambda\varphi_\lambda''(x) = 0, \tag{8.33}$$

and so on. This gives us a set of homogeneous linear equations in which $k_\lambda$
are the unknown quantities. By Section 3.1 there is a solution $k_\lambda \neq 0$ only if
the determinant of the coefficients of the $k_\lambda$s vanishes for all values of x. This
means that the Wronskian of $\varphi_1, \varphi_2, \ldots, \varphi_n$,

$$W(\varphi_1, \varphi_2, \ldots, \varphi_n) \equiv \begin{vmatrix}
\varphi_1 & \varphi_2 & \cdots & \varphi_n \\
\varphi_1' & \varphi_2' & \cdots & \varphi_n' \\
\cdots & \cdots & \cdots & \cdots \\
\varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)}
\end{vmatrix}, \tag{8.34}$$

a function of x, vanishes for all x.


1. If the Wronskian is not equal to zero, then Eq. (8.29) has no solution other
than $k_\lambda = 0$. The set of functions $\varphi_\lambda$ is therefore linearly independent.

2. If the Wronskian vanishes at isolated values of the argument, this does not
necessarily prove linear dependence (unless the set of functions has only
two functions). However, if the Wronskian is zero over the entire range of
the variable, the functions $\varphi_\lambda$ are linearly dependent over this range.³


EXAMPLE 8.3.7 


Linear Dependence For an illustration of linear dependence of three functions,
consider the solutions of the one-dimensional diffusion equation y'' = y.
We have $\varphi_1 = e^x$ and $\varphi_2 = e^{-x}$, and we add $\varphi_3 = \cosh x$, also a solution. The
Wronskian is

$$\begin{vmatrix}
e^x & e^{-x} & \cosh x \\
e^x & -e^{-x} & \sinh x \\
e^x & e^{-x} & \cosh x
\end{vmatrix} = 0.$$

The determinant vanishes for all x because the first and third rows are identical.
Hence, $e^x$, $e^{-x}$, and cosh x are linearly dependent, and indeed, we have a
relation of the form of Eq. (8.29):

$$e^x + e^{-x} - 2\cosh x = 0 \quad\text{with}\quad k_\lambda \neq 0.$$

■
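
A 3 × 3 Wronskian is easy to evaluate numerically once each function is supplied together with its first two derivatives. The Python sketch below (hypothetical helper names, derivatives entered by hand) reproduces the vanishing Wronskian of $e^x$, $e^{-x}$, cosh x and, for contrast, evaluates the Wronskian of sin x, $e^x$, $e^{-x}$, which equals 4 sin x and vanishes only at isolated points.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian3(funcs, x):
    """Wronskian of three functions, each given as (phi, phi', phi'')."""
    rows = [[f[k](x) for f in funcs] for k in range(3)]
    return det3(rows)

exp_plus = (math.exp, math.exp, math.exp)
exp_minus = (lambda x: math.exp(-x), lambda x: -math.exp(-x), lambda x: math.exp(-x))
cosh_f = (math.cosh, math.sinh, math.cosh)
sin_f = (math.sin, math.cos, lambda x: -math.sin(x))

x = 0.8
print(wronskian3([exp_plus, exp_minus, cosh_f], x))                   # dependent set
print(wronskian3([sin_f, exp_plus, exp_minus], x), 4 * math.sin(x))   # independent set
```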


Now we are ready to prove the theorem that a second-order homogeneous
ODE has two linearly independent solutions.

Suppose $y_1$, $y_2$, $y_3$ are three solutions of the homogeneous ODE [Eq. (8.28)].
Then we form the Wronskian $W_{jk} = y_j y_k' - y_j' y_k$ of any pair $y_j$, $y_k$ of them and
recall that $W_{jk}' = y_j y_k'' - y_j'' y_k$. Next we divide each ODE by y, getting −Q on
the right-hand side so that

$$\frac{y_j''}{y_j} + P\frac{y_j'}{y_j} = -Q(x) = \frac{y_k''}{y_k} + P\frac{y_k'}{y_k}.$$

Multiplying by $y_j y_k$, we find

$$(y_j y_k'' - y_j'' y_k) + P(y_j y_k' - y_j' y_k) = 0, \quad\text{or}\quad W_{jk}' = -PW_{jk} \tag{8.35}$$

for any pair of solutions. Finally, we evaluate the Wronskian of all three solutions
expanding it along the second row and using the ODEs for the $W_{jk}$:

$$W = \begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}
= -y_1'W_{23}' + y_2'W_{13}' - y_3'W_{12}'
= P(y_1'W_{23} - y_2'W_{13} + y_3'W_{12})
= -P\begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1' & y_2' & y_3' \end{vmatrix} = 0.$$

The vanishing Wronskian, W = 0, because of two identical rows is the condition
for linear dependence of the solutions $y_j$. Thus, there are at most two
linearly independent solutions of the homogeneous ODE. Similarly, one can
prove that a linear homogeneous nth-order ODE has n linearly independent
solutions $y_j$ so that the general solution $y(x) = \sum_j c_j y_j(x)$ is a linear combination
of them. ■

³ For proof, see H. Lass (1957), Elements of Pure and Applied Mathematics, p. 187. McGraw-Hill,
New York. It is assumed that the functions have continuous derivatives and that at least one of the
minors of the bottom row of Eq. (8.34) (Laplace expansion) does not vanish in [a, b], the interval
under consideration.


Biographical Data

Wronski, Jozef Maria. Wronski, a Polish mathematician (1778-1853) who
changed his name from Hoëné, introduced the determinants named after him.


SUMMARY

In summary, second-order ODEs require two integrations and therefore contain
two integration constants, and there are two linearly independent solutions.
The general solution $y_p + c_1 y_1 + c_2 y_2$ of the inhomogeneous ODE consists
of a particular solution $y_p$ and the general solution of the homogeneous ODE.
If an ODE y'' = f(y', y) does not contain the variable x, then a solution of the
form y' = z(y) reduces the second-order ODE to a first-order ODE. An ODE
where the unknown function y(x) does not appear can be reduced to first order
similarly, and combinations of these types can also be treated. Euler’s ODE involving
$x^2y''$, xy', y linearly is solved by a linear combination of the power $x^p$,
where the exponent p is a solution of a quadratic equation, to which the ODE
reduces. ODEs with constant coefficients are solved by exponential functions
$e^{px}$, where the exponent p is a solution of a quadratic equation, to which the
ODE reduces.

EXERCISES 

8.3.1 You know that the three unit vectors $\hat{x}$, $\hat{y}$, and $\hat{z}$ are mutually perpendicular
(orthogonal). Show that $\hat{x}$, $\hat{y}$, and $\hat{z}$ are linearly independent.
Specifically show that no relation of the form of Eq. (8.30) exists for
$\hat{x}$, $\hat{y}$, and $\hat{z}$.

8.3.2 The criterion for the linear independence of three vectors A, B, and
C is that the equation

$$a\mathbf{A} + b\mathbf{B} + c\mathbf{C} = 0$$

[analogous to Eq. (8.30)] has no solution other than the trivial a =
b = c = 0. Using components $\mathbf{A} = (A_1, A_2, A_3)$, and so on, set up the
determinant criterion for the existence or nonexistence of a nontrivial
solution for the coefficients a, b, and c. Show that your criterion is
equivalent to the scalar product A · B × C ≠ 0.

8.3.3 Using the Wronskian determinant, show that the set of functions

$$\left\{1, \frac{x^n}{n!}\ (n = 1, 2, \ldots, N)\right\}$$

is linearly independent.

8.3.4 If the Wronskian of two functions $y_1$ and $y_2$ is identically zero, show
by direct integration that

$$y_1 = cy_2;$$

that is, $y_1$ and $y_2$ are linearly dependent. Assume the functions have
continuous derivatives and that at least one of the functions does not
vanish in the interval under consideration.

8.3.5 The Wronskian of two functions is found to be zero at $x = x_0$ and all x
in a small neighborhood of $x_0$. Show that this Wronskian vanishes for
all x and that the functions are linearly dependent. If $x_0$ is an isolated
zero of the Wronskian, show by giving a counterexample that linear
dependence is not a valid conclusion in general.

8.3.6 The three functions sin x, $e^x$, and $e^{-x}$ are linearly independent. No one
function can be written as a linear combination of the other two. Show
that the Wronskian of sin x, $e^x$, and $e^{-x}$ vanishes but only at isolated
points.

ANS. W = 4 sin x,
W = 0 for x = ±nπ, n = 0, 1, 2, ....

8.3.7 Consider two functions $\varphi_1 = x$ and $\varphi_2 = |x| = x\,\mathrm{sgn}\,x$ (Fig. 8.7). The
function sgn x is the sign of x. Since $\varphi_1' = 1$ and $\varphi_2' = \mathrm{sgn}\,x$, $W(\varphi_1, \varphi_2) = 0$
for any interval including [−1, +1]. Does the vanishing of the Wronskian
over [−1, +1] prove that $\varphi_1$ and $\varphi_2$ are linearly dependent? Clearly,
they are not. What is wrong?

8.3.8 Explain that linear independence does not mean the absence of any
dependence. Illustrate your argument with $y_1 = \cosh x$ and $y_2 = e^x$.

8.3.9 Find and plot the solution of the ODE satisfying the given initial
conditions:

1. y'' + 3y' − 4y = 0 with y(0) = 1, y'(0) = 0,

2. y'' + 2y' − 3y = 0 with y(0) = 0, y'(0) = 1,

3. y'' + 2y' + 3y = 0 with y(0) = 0, y'(0) = 1.

8.3.10 Find the general solution of the ODEs in Exercise 8.3.9.

Figure 8.7 x and |x|



8.3.11 Find and plot the solution of the ODE satisfying the given boundary
conditions:

1. y'' + 3y' − 4y = 0 with y(0) = 1, y(∞) = 0,

2. y'' + 2y' − 3y = 0 with y(0) = 1, y(−∞) = 0,

3. y'' + 4y' − 12y = 0 with y(0) = 1, y(1) = 2.

8.3.12 Find and plot a particular solution of the inhomogeneous ODE

1. y'' + 3y' − 4y = sin ωx,

2. y'' + 3y' − 4y = cos ωx.

8.3.13 Find the general solution of the ODE $x^2y'' + xy' - n^2y = 0$ for integer
n.

8.3.14 Solve the ODE y'' + 9y = 0 using the ansatz y' = z(y) as the ODE
does not contain the variable x. Compare your result with the standard
solution of an ODE with constant coefficients.

8.3.15 Find and plot a particular solution of the following ODEs and give
all details for the general solution of the corresponding homogeneous
ODEs

1. y'' + 3y = 2 cos x − 3 sin 2x,

2. y'' + 4y' + 20y = sin x + cos x,

3. $y'' + y' - 2y = e^x/x$.

8.3.16 The sun moves along the x-axis with constant velocity c ≠ 0. A planet
moves around it so that its velocity is always perpendicular to the
radius vector from the sun to the planet, but no other force is acting
(i.e., no gravitational force). Show that Kepler’s area law is valid and
the planet’s orbit is an ellipse with the sun in a focus.

8.3.17 A small massive sphere is elastically coupled to the origin moving in a
straight line through the origin in a massless glass tube that rotates at
a constant angular velocity ω around the origin. Describe the orbit of
the mass if it is at r = a, $\dot{r} = 0$ at time t = 0.

ANS. The rosetta curve $r = a\cos N\varphi$, $N = \sqrt{(\omega_0/\omega)^2 - 1}$ for
$\omega_0^2 = k/m > \omega^2$; for $\omega_0 = \omega$ a circle, and for $\omega_0 < \omega$ a
hyperbolic cosine spiral $r = a\cosh n\varphi$, $n = \sqrt{1 - (\omega_0/\omega)^2}$.

8.3.18 A charged particle of mass m and charge e is moving in a constant
electric field in the positive x-direction and a constant magnetic field
in the positive z-direction. At time t = 0 the particle is located at the
origin with velocity v in the y-direction. Determine the motion r(t) and
orbits for the cases B = 0, E ≠ 0; E = 0, B ≠ 0; E ≠ 0, B ≠ 0, v = 0;
E ≠ 0, B ≠ 0, v ≠ 0. Plot the orbits.

8.3.19 Two small masses $m_1$, $m_2$ are suspended at the ends of a rope of constant
length L over a pulley. Find their motion $z_i(t)$ under the influence
of the constant gravitational acceleration g = 9.8 m/sec². Discuss various
initial conditions.

8.3.20 Find the general solution of the ODE $x^2y'' - 4xy' + 6y = 14x^{-4}$, showing
all steps of your calculations.

8.3.21 Find the steady-state current of the RLC circuit in Example 8.3.6 for
R = 7 Ω, L = 10 H, C = 10⁻⁴ F, V = 220 sin 60t V.

8.3.22 Find the transient current for Exercise 8.3.21. 


8.4 Singular Points 


In this section, the concept of a singular point or singularity (as applied to a 
differential equation) is introduced. The interest in this concept stems from 
its usefulness in (i) classifying ODEs and (ii) investigating the feasibility of a 
series solution. This feasibility is the topic of Fuchs’s theorem (Section 8.5). 
First, we give a definition of ordinary and singular points of ODEs. 

All the ODEs listed in Sections 8.2 and 8.3 may be solved for $d^2y/dx^2$. We
have

$$y'' = f(x, y, y'). \tag{8.36}$$

Now, if in Eq. (8.36), $y$ and $y'$ can take on all finite values at $x = x_0$ and $y''$
remains finite, point $x = x_0$ is an ordinary point. On the other hand, if $y''$
becomes infinite for any finite choice of $y$ and $y'$, point $x = x_0$ is labeled a
singular point. We need to understand if the solution $y(x_0)$ is still well defined
at such a point.

Another way of presenting this definition of a singular point is to write our
second-order, homogeneous, linear differential equation (in $y$) as

$$y'' + P(x)y' + Q(x)y = 0. \tag{8.37}$$

Now, if the functions $P(x)$ and $Q(x)$ remain finite at $x = x_0$, point $x = x_0$ is
an ordinary point. However, if $P(x)$ or $Q(x)$ (or both) diverges as $x \to x_0$,
point $x_0$ is a singular point. Using Eq. (8.37), we may distinguish between
two kinds of singular points.

1. If either $P(x)$ or $Q(x)$ diverges as $x \to x_0$ but $(x - x_0)P(x)$ and $(x - x_0)^2 Q(x)$
remain finite as $x \to x_0$, then $x = x_0$ is a regular or nonessential singular
point. We shall see that a power series solution is possible at ordinary points
and regular singularities.

2. If $P(x)$ diverges faster than $1/(x - x_0)$ so that $(x - x_0)P(x)$ goes to infinity as
$x \to x_0$, or $Q(x)$ diverges faster than $1/(x - x_0)^2$ so that $(x - x_0)^2 Q(x)$ goes
to infinity as $x \to x_0$, then point $x = x_0$ is an irregular or essential singularity.
We shall see that at such essential singularities a solution usually
does not exist.
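
The two-step test above can be sketched numerically. The following is only a heuristic illustration, not a proof: the helper name `classify_point`, the probe distances, and the bound `1e3` are assumptions made for this sketch. It probes $P$, $Q$, $(x - x_0)P$, and $(x - x_0)^2 Q$ at points approaching $x_0$ and reports whether they appear to stay bounded.

```python
def classify_point(P, Q, x0, offsets=(1e-2, 1e-4, 1e-6), bound=1e3):
    """Heuristically classify x0 for y'' + P(x) y' + Q(x) y = 0.

    The offsets and the bound are illustrative choices, not part of the text.
    """

    def stays_bounded(f):
        # Probe f on both sides of x0 at shrinking distances.
        return all(abs(f(x0 + s * h)) < bound for h in offsets for s in (1, -1))

    if stays_bounded(P) and stays_bounded(Q):
        return "ordinary"
    if stays_bounded(lambda x: (x - x0) * P(x)) and stays_bounded(
        lambda x: (x - x0) ** 2 * Q(x)
    ):
        return "regular singular"
    return "irregular singular"


n = 1  # Bessel order, chosen only for illustration
bessel_P = lambda x: 1 / x
bessel_Q = lambda x: 1 - n**2 / x**2

print(classify_point(bessel_P, bessel_Q, 0.0))
print(classify_point(bessel_P, bessel_Q, 2.0))
# y'' - (6/x^3) y = 0 has Q diverging faster than 1/x^2 at the origin.
print(classify_point(lambda x: 0.0, lambda x: -6 / x**3, 0.0))
```

A symbolic limit would be the rigorous route; the numerical probe is used here only to keep the sketch free of dependencies.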

These definitions hold for all finite values of $x_0$. The analysis of point $x \to \infty$
is similar to the treatment of functions of a complex variable (Chapters 6 and
7). We set $x = 1/z$, substitute into the differential equation, and then let $z \to 0$.
By changing variables in the derivatives, we have

$$\frac{dy(x)}{dx} = \frac{dy(z^{-1})}{dz}\frac{dz}{dx} = -\frac{1}{x^2}\frac{dy(z^{-1})}{dz} = -z^2\frac{dy(z^{-1})}{dz}, \tag{8.38}$$

$$\frac{d^2y(x)}{dx^2} = \frac{d}{dz}\left[\frac{dy(x)}{dx}\right]\frac{dz}{dx} = (-z^2)\left[-2z\frac{dy(z^{-1})}{dz} - z^2\frac{d^2y(z^{-1})}{dz^2}\right] = 2z^3\frac{dy(z^{-1})}{dz} + z^4\frac{d^2y(z^{-1})}{dz^2}. \tag{8.39}$$


Using these results, we transform Eq. (8.37) into

$$z^4\frac{d^2y}{dz^2} + \left[2z^3 - z^2P(z^{-1})\right]\frac{dy}{dz} + Q(z^{-1})y = 0. \tag{8.40}$$


The behavior at $x = \infty$ ($z = 0$) then depends on the behavior of the new
coefficients

$$\frac{2z - P(z^{-1})}{z^2} \quad\text{and}\quad \frac{Q(z^{-1})}{z^4}$$

as $z \to 0$. If these two expressions remain finite, point $x = \infty$ is an ordinary
point. If they diverge no more rapidly than $1/z$ and $1/z^2$, respectively, point
$x = \infty$ is a regular singular point; otherwise, it is an irregular singular
point (an essential singularity).

EXAMPLE 8.4.1

Bessel Singularity Bessel’s equation is

$$x^2y'' + xy' + (x^2 - n^2)y = 0. \tag{8.41}$$

Comparing it with Eq. (8.37) we have

$$P(x) = \frac{1}{x}, \qquad Q(x) = 1 - \frac{n^2}{x^2},$$

which shows that point $x = 0$ is a regular singularity. By inspection we see
that there are no other singular points in the finite range. As $x \to \infty$ ($z \to 0$),
from Eq. (8.41) we have the coefficients

$$\frac{2z - z}{z^2} \quad\text{and}\quad \frac{1 - n^2z^2}{z^4}.$$

Since the latter expression diverges as $1/z^4$, point $x = \infty$ is an irregular or
essential singularity.

More examples of ODEs with regular and irregular singularities are discussed
in Section 8.5. ■
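
The conclusion of Example 8.4.1 can be checked with a short numerical sketch. It builds the two coefficients that follow from Eq. (8.40) and probes whether $z$ times the first and $z^2$ times the second stay bounded as $z \to 0$. The helper names, probe points, and bound are assumptions made for this illustration.

```python
def coefficients_at_infinity(P, Q):
    """Coefficients of dy/dz and y in Eq. (8.40) after dividing by z**4."""
    P_new = lambda z: (2 * z - P(1 / z)) / z**2
    Q_new = lambda z: Q(1 / z) / z**4
    return P_new, Q_new


def stays_bounded(f, offsets=(1e-2, 1e-4, 1e-6), bound=1e3):
    # Heuristic probe: evaluate f at points approaching z = 0 from above.
    return all(abs(f(h)) < bound for h in offsets)


n = 1  # Bessel order, chosen only for illustration
P_new, Q_new = coefficients_at_infinity(lambda x: 1 / x, lambda x: 1 - n**2 / x**2)

z_times_P = lambda z: z * P_new(z)        # stays near 1
z2_times_Q = lambda z: z**2 * Q_new(z)    # grows like 1/z**2

print(stays_bounded(z_times_P))
print(stays_bounded(z2_times_Q))
```

The second probe failing to stay bounded is what marks $x = \infty$ as an irregular singularity of Bessel’s equation.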


EXERCISES 

8.4.1 Show that Legendre’s equation has regular singularities at $x = -1$, $1$,
and $\infty$.

8.4.2 Show that Laguerre’s equation, like the Bessel equation, has a regular
singularity at $x = 0$ and an irregular singularity at $x = \infty$.

8.4.3 Show that the substitution 


converts the hypergeometric equation into Legendre’s equation. 


8.5 Series Solutions—Frobenius’s Method 


In this section, we develop a method of obtaining one solution of the linear, 
second-order, homogeneous ODE. The method, a power series expansion, 
will always work, provided the point of expansion is no worse than a regular 
singular point, a gentle condition that is almost always satisfied in physics. 

A linear, second-order, homogeneous ODE may be put in the form

$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = 0. \tag{8.42}$$

The equation is homogeneous because each term contains $y(x)$ or a derivative,
and it is linear because each $y$, $dy/dx$, or $d^2y/dx^2$ appears to the first
power, with no products.

Here, we develop (at least) one solution of Eq. (8.42). In Section 8.6, we
develop the second, independent solution. We have proved that no third,
independent solution exists. Therefore, the most general solution of the
homogeneous ODE, Eq. (8.42), may be written as

$$y_h(x) = c_1y_1(x) + c_2y_2(x) \tag{8.43}$$

as a consequence of the superposition principle for linear ODEs. Our physical
problem may involve a driving term and lead to a nonhomogeneous, linear,
second-order ODE

$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = F(x). \tag{8.44}$$

The function on the right, $F(x)$, represents a source (such as electrostatic
charge) or a driving force (as in a driven oscillator). These are also explored
in detail in Chapter 15 with a Laplace transform technique. Calling a
particular solution of Eq. (8.44) $y_p$, we may add to it any solution of the corresponding
homogeneous equation [Eq. (8.42)]. Hence, the most general solution of
the inhomogeneous ODE [Eq. (8.44)] is

$$y(x) = c_1y_1(x) + c_2y_2(x) + y_p(x). \tag{8.45}$$

The constants $c_1$ and $c_2$ will eventually be fixed by boundary or initial conditions.

For now, we assume that $F(x) = 0$, that is, that our differential equation is
homogeneous. We shall attempt to develop a solution of our linear, second-order,
homogeneous differential equation [Eq. (8.42)] by substituting in a power
series with undetermined coefficients. Also available as a parameter is the
power of the lowest nonvanishing term of the series. To illustrate, we apply
the method to two important differential equations. First, the linear oscillator
equation

$$\frac{d^2y}{dx^2} + \omega^2y = 0, \tag{8.46}$$

with known solutions $y = \sin\omega x$, $\cos\omega x$. Now we try

$$y(x) = x^k(a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots) = \sum_{\lambda=0}^{\infty}a_\lambda x^{k+\lambda}, \qquad a_0 \neq 0, \tag{8.47}$$

with the exponent $k$ and all the coefficients $a_\lambda$ still undetermined. Note that $k$
need not be an integer. By differentiating twice, we obtain

$$\frac{dy}{dx} = \sum_{\lambda=0}^{\infty}a_\lambda(k+\lambda)x^{k+\lambda-1},$$

$$\frac{d^2y}{dx^2} = \sum_{\lambda=0}^{\infty}a_\lambda(k+\lambda)(k+\lambda-1)x^{k+\lambda-2}.$$

By substituting the series for $y$ and $y''$ into the ODE [Eq. (8.46)], we have

$$\sum_{\lambda=0}^{\infty}a_\lambda(k+\lambda)(k+\lambda-1)x^{k+\lambda-2} + \omega^2\sum_{\lambda=0}^{\infty}a_\lambda x^{k+\lambda} = 0. \tag{8.48}$$

From our analysis of the uniqueness of power series (Chapter 5) the coefficients
of each power of $x$ on the left-hand side of Eq. (8.48) must vanish
individually.

The lowest power of $x$ appearing in Eq. (8.48) is $x^{k-2}$, for $\lambda = 0$ in the first
summation. The requirement that the coefficient vanish$^4$ yields

$$a_0k(k-1) = 0.$$

We had chosen $a_0$ as the coefficient of the lowest nonvanishing term of the
series [Eq. (8.47)]; hence, by definition, $a_0 \neq 0$. Therefore, we have

$$k(k-1) = 0. \tag{8.49}$$

This equation, coming from the coefficient of the lowest power of $x$, we call
the indicial equation. The indicial equation and its roots are of critical importance
to our analysis. The coefficient $a_1(k+1)k$ of $x^{k-1}$ must also vanish.
This is satisfied if $k = 0$; if $k = 1$, then $a_1 = 0$. Clearly, in this example we must
require that either $k = 0$ or $k = 1$.

Before considering these two possibilities for $k$, we return to Eq. (8.48) and
demand that the remaining coefficients, viz., the coefficient of $x^{k+j}$ ($j \geq 0$),
vanish. We set $\lambda = j + 2$ in the first summation and $\lambda = j$ in the second. (They
are independent summations and $\lambda$ is a dummy index.) This results in

$$a_{j+2}(k+j+2)(k+j+1) + \omega^2a_j = 0$$

or

$$a_{j+2} = -a_j\frac{\omega^2}{(k+j+2)(k+j+1)}. \tag{8.50}$$


This is a two-term recurrence relation.$^5$ Given $a_j$, we may compute $a_{j+2}$ and
then $a_{j+4}$, $a_{j+6}$, and so on as far as desired. Note that for this example, if we
start with $a_0$, Eq. (8.50) leads to the even coefficients $a_2$, $a_4$, and so on and
ignores $a_1$, $a_3$, $a_5$, and so on. Since $a_1$ is arbitrary if $k = 0$ and necessarily zero
if $k = 1$, let us set it equal to zero (compare Exercises 8.5.3 and 8.5.4) and then
by Eq. (8.50)

$$a_3 = a_5 = a_7 = \cdots = 0,$$

and all the odd-numbered coefficients vanish. The odd powers of $x$ will actually
reappear when the second root of the indicial equation is used.

4 See the uniqueness of power series (Section 5.7).

5 Recurrence relations may involve three or more terms; that is, $a_{j+2}$, depending on $a_j$ and $a_{j-2}$,
etc. An unusual feature is that it goes in steps of two rather than the more common steps of one.
This feature will be explained by a symmetry of the ODE called parity.

Returning to Eq. (8.49), our indicial equation, we first try the solution $k = 0$.
The recurrence relation [Eq. (8.50)] becomes

$$a_{j+2} = -a_j\frac{\omega^2}{(j+2)(j+1)}, \tag{8.51}$$

which leads to

$$a_2 = -a_0\frac{\omega^2}{1\cdot 2} = -\frac{\omega^2}{2!}a_0,$$

$$a_4 = -a_2\frac{\omega^2}{3\cdot 4} = +\frac{\omega^4}{4!}a_0,$$

$$a_6 = -a_4\frac{\omega^2}{5\cdot 6} = -\frac{\omega^6}{6!}a_0, \quad\text{and so on.}$$

By inspection (or mathematical induction),

$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n)!}a_0, \tag{8.52}$$

and our solution is

$$y(x)_{k=0} = a_0\left[1 - \frac{(\omega x)^2}{2!} + \frac{(\omega x)^4}{4!} - \frac{(\omega x)^6}{6!} + \cdots\right] = a_0\cos\omega x. \tag{8.53}$$

If we choose the indicial equation root $k = 1$ [Eq. (8.49)], the recurrence
relation becomes

$$a_{j+2} = -a_j\frac{\omega^2}{(j+3)(j+2)}. \tag{8.54}$$

Substituting in $j = 0, 2, 4$, successively, we obtain

$$a_2 = -a_0\frac{\omega^2}{2\cdot 3} = -\frac{\omega^2}{3!}a_0,$$

$$a_4 = -a_2\frac{\omega^2}{4\cdot 5} = +\frac{\omega^4}{5!}a_0,$$

$$a_6 = -a_4\frac{\omega^2}{6\cdot 7} = -\frac{\omega^6}{7!}a_0, \quad\text{and so on.}$$

Again, by inspection and mathematical induction,

$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n+1)!}a_0. \tag{8.55}$$

For this choice, $k = 1$, we obtain

$$y(x)_{k=1} = a_0x\left[1 - \frac{(\omega x)^2}{3!} + \frac{(\omega x)^4}{5!} - \frac{(\omega x)^6}{7!} + \cdots\right]
= \frac{a_0}{\omega}\left[(\omega x) - \frac{(\omega x)^3}{3!} + \frac{(\omega x)^5}{5!} - \frac{(\omega x)^7}{7!} + \cdots\right]
= \frac{a_0}{\omega}\sin\omega x. \tag{8.56}$$

Figure 8.8 Schematic Power Series



SUMMARY 


To summarize this power series approach, we may write Eq. (8.48) schematically
as shown in Fig. 8.8. From the uniqueness of power series (Section 5.7),
the total coefficient of each power of $x$ must vanish all by itself. The requirement
that the first coefficient vanish leads to the indicial equation [Eq. (8.49)].
The second coefficient is handled by setting $a_1 = 0$. The vanishing of the coefficient
of $x^k$ (and higher powers, taken one at a time) leads to the recurrence
relation [Eq. (8.50)].
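
As an illustration of this summary, the recurrence relation of Eq. (8.50) can be iterated directly and the partial sums compared with the closed forms of Eqs. (8.53) and (8.56). This is a minimal sketch; the function name `oscillator_series`, the number of terms, and the sample values of $\omega$ and $x$ are assumptions chosen for the example.

```python
import math


def oscillator_series(x, omega, k, a0=1.0, terms=30):
    """Partial sum of x**k * sum_j a_j x**j with a_{j+2} from Eq. (8.50) and a_1 = 0."""
    total = 0.0
    a_j = a0
    for j in range(0, 2 * terms, 2):
        total += a_j * x ** (k + j)
        # Two-term recurrence: a_{j+2} = -a_j * omega^2 / ((k+j+2)(k+j+1)).
        a_j = -a_j * omega**2 / ((k + j + 2) * (k + j + 1))
    return total


omega, x = 2.0, 0.7
print(oscillator_series(x, omega, k=0), math.cos(omega * x))          # k = 0 root
print(oscillator_series(x, omega, k=1), math.sin(omega * x) / omega)  # k = 1 root
```

With $a_0 = 1$ the two printed pairs should agree to rounding error, the $k = 0$ root reproducing $\cos\omega x$ and the $k = 1$ root reproducing $(\sin\omega x)/\omega$.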

This series substitution, known as Frobenius’s method, has given us two 
series solutions of the linear oscillator equation. However, there are two points 
about such series solutions that must be strongly emphasized: 


• The series solution should always be substituted back into the differential 
equation, to see if it works, as a precaution against algebraic and logical 
errors. If it works, it is a solution. 

• The acceptability of a series solution depends on its convergence (including 
asymptotic convergence). It is quite possible for Frobenius’s method to 
give a series solution that satisfies the original differential equation, when 
substituted in the equation, but that does not converge over the region of 
interest. 
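
The first precaution can be sketched for the cosine series of Eq. (8.52). The code below substitutes a truncated series back into $y'' + \omega^2 y$ using exact rational arithmetic and lists the coefficient of each power of $x$ in the result. The helper names and the choice $\omega = 3/2$ are assumptions for this illustration only.

```python
from fractions import Fraction


def cosine_coefficients(omega, n_terms):
    """Coefficients c[m] of x**m built from Eq. (8.51); odd entries stay zero."""
    c = [Fraction(0)] * (2 * n_terms)
    c[0] = Fraction(1)
    for j in range(0, 2 * n_terms - 2, 2):
        c[j + 2] = -c[j] * omega**2 / ((j + 2) * (j + 1))
    return c


def oscillator_residual(c, omega):
    """Coefficient of x**m in y'' + omega**2 * y for the polynomial with coefficients c."""
    n = len(c)
    residual = []
    for m in range(n):
        second_derivative = (m + 2) * (m + 1) * c[m + 2] if m + 2 < n else 0
        residual.append(second_derivative + omega**2 * c[m])
    return residual


omega = Fraction(3, 2)
c = cosine_coefficients(omega, 6)
residual = oscillator_residual(c, omega)
print(residual[:-2])  # every retained power cancels exactly
print(residual[-2:])  # what is left over comes only from truncating the series
```

Only the highest powers fail to cancel, which reflects the truncation rather than an algebraic error. Convergence, the second point above, still has to be examined separately.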


Expansion about $x_0$


Equation (8.47) is an expansion about the origin, $x_0 = 0$. It is perfectly possible
to replace Eq. (8.47) with

$$y(x) = \sum_{\lambda=0}^{\infty}a_\lambda(x - x_0)^{k+\lambda}, \qquad a_0 \neq 0. \tag{8.57}$$

The point $x_0$ should not be chosen at an essential singularity, or our Frobenius
method will probably fail. The resultant series ($x_0$ an ordinary point or regular
singular point) will be valid where it converges. You can expect a divergence
of some sort when $|x - x_0| = |z_s - x_0|$, where $z_s$ is the closest singularity to $x_0$
in the complex plane.


Symmetry of ODE and Solutions 


Note that for the ODE in Eq. (8.46) we obtained one solution of even symmetry,
defined as $y_1(x) = y_1(-x)$, and one of odd symmetry, defined as
$y_2(x) = -y_2(-x)$. This is not just an accident but a direct consequence of the
form of the ODE. Writing a general ODE as

$$\mathcal{L}(x)y(x) = 0, \tag{8.58}$$

where $\mathcal{L}(x)$ is the differential operator, we see that for the linear oscillator
equation [Eq. (8.46)], upon reversing the coordinate $x \to -x$ (defined as parity
transformation),

$$\mathcal{L}(x) = \mathcal{L}(-x) \tag{8.59}$$

is even under parity. Whenever the differential operator has a specific parity
or symmetry, either even or odd, we may interchange $+x$ and $-x$, and Eq.
(8.58) becomes

$$\pm\mathcal{L}(x)y(-x) = 0. \tag{8.60}$$

It is $+$ if $\mathcal{L}(x)$ is even and $-$ if $\mathcal{L}(x)$ is odd. Clearly, if $y(x)$ is a solution of
the differential equation, $y(-x)$ is also a solution. Then any solution may be
resolved into even and odd parts,

$$y(x) = \tfrac{1}{2}[y(x) + y(-x)] + \tfrac{1}{2}[y(x) - y(-x)], \tag{8.61}$$

the first bracket on the right giving an even solution and the second an odd
solution. Such a combination of solutions of definite parity has no definite
parity.

Many other ODEs of importance in physics exhibit this even parity; that is,
their $P(x)$ in Eq. (8.42) is odd and $Q(x)$ even. Solutions of all of them may be
presented as series of even powers of $x$ and separate series of odd powers of
$x$. Parity is particularly important in quantum mechanics. We find that wave
functions are usually either even or odd, meaning that they have a definite
parity. For example, the Coulomb potential in the Schrödinger equation for
hydrogen has positive parity. As a result, its solutions have definite parity.
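
The decomposition of Eq. (8.61) is easy to try out. In the sketch below, the solution $y = \cos(x - 0.3)$ of $y'' + y = 0$ (an example chosen here, not taken from the text) has no definite parity, and its even and odd parts come out as multiples of $\cos x$ and $\sin x$. The helper names are assumptions for this illustration.

```python
import math


def even_part(y):
    """First bracket of Eq. (8.61)."""
    return lambda x: 0.5 * (y(x) + y(-x))


def odd_part(y):
    """Second bracket of Eq. (8.61)."""
    return lambda x: 0.5 * (y(x) - y(-x))


phase = 0.3
y = lambda x: math.cos(x - phase)  # solves y'' + y = 0 but is neither even nor odd
y_even, y_odd = even_part(y), odd_part(y)

x = 1.1
print(y_even(x), math.cos(phase) * math.cos(x))  # even part is proportional to cos x
print(y_odd(x), math.sin(phase) * math.sin(x))   # odd part is proportional to sin x
```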


Limitations of Series Approach: Bessel’s Equation


The power series solution for the linear oscillator equation was perhaps a
bit too easy. By substituting the power series [Eq. (8.47)] into the differential
equation [Eq. (8.46)], we obtained two independent solutions with no trouble
at all.

To get some idea of what can happen, we try to solve Bessel’s equation,

$$x^2y'' + xy' + (x^2 - n^2)y = 0. \tag{8.62}$$

Again, assuming a solution of the form

$$y(x) = \sum_{\lambda=0}^{\infty}a_\lambda x^{k+\lambda},$$

we differentiate and substitute into Eq. (8.62). The result is

$$\sum_{\lambda=0}^{\infty}a_\lambda(k+\lambda)(k+\lambda-1)x^{k+\lambda} + \sum_{\lambda=0}^{\infty}a_\lambda(k+\lambda)x^{k+\lambda} + \sum_{\lambda=0}^{\infty}a_\lambda x^{k+\lambda+2} - \sum_{\lambda=0}^{\infty}a_\lambda n^2x^{k+\lambda} = 0. \tag{8.63}$$

By setting $\lambda = 0$, we get the coefficient of $x^k$, the lowest power of $x$ appearing
on the left-hand side,

$$a_0[k(k-1) + k - n^2] = 0, \tag{8.64}$$

and again $a_0 \neq 0$ by definition. Equation (8.64) therefore yields the indicial
equation

$$k^2 - n^2 = 0, \tag{8.65}$$

with solutions $k = \pm n$.

It is of interest to examine the coefficient of $x^{k+1}$. Here, we obtain

$$a_1[(k+1)k + k + 1 - n^2] = 0$$

or

$$a_1(k + 1 - n)(k + 1 + n) = 0. \tag{8.66}$$

For $k = \pm n$, neither $k + 1 - n$ nor $k + 1 + n$ vanishes and we must require
$a_1 = 0$.$^6$

Proceeding to the coefficient of $x^{k+j}$ for $k = n$, we set $\lambda = j$ in the first,
second, and fourth terms of Eq. (8.63) and $\lambda = j - 2$ in the third term. By
requiring the resultant coefficient of $x^{k+j}$ to vanish, we obtain

$$a_j[(n+j)(n+j-1) + (n+j) - n^2] + a_{j-2} = 0.$$

When $j$ is replaced by $j + 2$, this can be rewritten for $j \geq 0$ as

$$a_{j+2} = -a_j\frac{1}{(j+2)(2n+j+2)}, \tag{8.67}$$

which is the desired recurrence relation. Repeated application of this recurrence
relation leads to

$$a_2 = -a_0\frac{1}{2(2n+2)} = -\frac{a_0\,n!}{2^2\,1!\,(n+1)!},$$

$$a_4 = -a_2\frac{1}{4(2n+4)} = \frac{a_0\,n!}{2^4\,2!\,(n+2)!},$$

$$a_6 = -a_4\frac{1}{6(2n+6)} = -\frac{a_0\,n!}{2^6\,3!\,(n+3)!}, \quad\text{and so on,}$$

and in general,

$$a_{2p} = (-1)^p\frac{a_0\,n!}{2^{2p}\,p!\,(n+p)!}. \tag{8.68}$$

6 $k = \pm n = -\tfrac{1}{2}$ are exceptions.

Inserting these coefficients in our assumed series solution, we have

$$y(x) = a_0x^n\left[1 - \frac{n!\,x^2}{2^2\,1!\,(n+1)!} + \frac{n!\,x^4}{2^4\,2!\,(n+2)!} - \cdots\right]. \tag{8.69}$$

In summation form

$$y(x) = a_0\sum_{j=0}^{\infty}(-1)^j\frac{n!\,x^{n+2j}}{2^{2j}\,j!\,(n+j)!}
= a_0\,2^n\,n!\sum_{j=0}^{\infty}(-1)^j\frac{1}{j!\,(n+j)!}\left(\frac{x}{2}\right)^{n+2j}. \tag{8.70}$$


In Chapter 12, the final summation is identified as the Bessel function $J_n(x)$.
Notice that this solution $J_n(x)$ has either even or odd symmetry,$^7$ as might be
expected from the form of Bessel’s equation.
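
The recurrence relation of Eq. (8.67), the closed form of Eq. (8.68), and the final sum in Eq. (8.70) can be compared numerically. The sketch below assumes a nonnegative integer $n$ and the normalization $a_0 2^n n! = 1$; the function names and the number of terms are choices made for this illustration.

```python
import math


def bessel_series(n, x, terms=30):
    """Partial sum of the final series in Eq. (8.70), taking a_0 * 2**n * n! = 1."""
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(n + j)) * (x / 2) ** (n + 2 * j)
        for j in range(terms)
    )


def recurrence_coefficients(n, a0, count):
    """Even coefficients a_0, a_2, a_4, ... generated by Eq. (8.67) with k = n."""
    coeffs = [a0]
    for j in range(0, 2 * (count - 1), 2):
        coeffs.append(-coeffs[-1] / ((j + 2) * (2 * n + j + 2)))
    return coeffs


def closed_form_coefficients(n, a0, count):
    """The same coefficients from Eq. (8.68)."""
    return [
        (-1) ** p * a0 * math.factorial(n)
        / (2 ** (2 * p) * math.factorial(p) * math.factorial(n + p))
        for p in range(count)
    ]


print(recurrence_coefficients(2, 1.0, 5))
print(closed_form_coefficients(2, 1.0, 5))
print(bessel_series(0, 1.0), bessel_series(1, 1.0))
# Parity: even n gives an even function, odd n an odd function.
print(bessel_series(2, -0.8) - bessel_series(2, 0.8))
print(bessel_series(1, -0.8) + bessel_series(1, 0.8))
```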

When $k = -n$, and $n$ is not an integer, we may generate a second distinct
series to be labeled $J_{-n}(x)$. However, when $-n$ is a negative integer, trouble
develops. The recurrence relation for the coefficients $a_j$ is still given by Eq.
(8.67), but with $2n$ replaced by $-2n$. Then, when $j + 2 = 2n$ or $j = 2(n - 1)$,
the coefficient $a_{j+2}$ blows up and we have no series solution. This catastrophe
can be remedied in Eq. (8.70), as it is done in Chapter 12, with the result that

$$J_{-n}(x) = (-1)^nJ_n(x), \qquad n \text{ an integer}. \tag{8.71}$$

The second solution simply reproduces the first. We have failed to construct
a second independent solution for Bessel’s equation by this series technique
when $n$ is an integer.
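
The breakdown for $k = -n$ can be made concrete. For a general root $k$ the coefficient of $x^{k+j+2}$ in Eq. (8.63) gives $a_{j+2}[(k+j+2)^2 - n^2] + a_j = 0$, which reduces to Eq. (8.67) for $k = n$. The sketch below iterates this relation exactly and reports where the denominator vanishes. The function name and the sample orders are assumptions for this illustration.

```python
from fractions import Fraction


def coefficients_for_root(k, n, a0=1, count=6):
    """Even coefficients for Bessel's equation using indicial root k (k = n or k = -n)."""
    coeffs = [Fraction(a0)]
    for j in range(0, 2 * (count - 1), 2):
        # From Eq. (8.63): a_{j+2} * [(k + j + 2)**2 - n**2] + a_j = 0.
        denominator = (k + j + 2) ** 2 - n**2
        if denominator == 0:
            raise ZeroDivisionError(f"recurrence breaks down at j = {j}")
        coeffs.append(-coeffs[-1] / denominator)
    return coeffs


print(coefficients_for_root(2, 2))  # k = +n works for integer n
print(coefficients_for_root(Fraction(-1, 3), Fraction(1, 3)))  # k = -n, nonintegral n
try:
    coefficients_for_root(-2, 2)  # k = -n with n = 2
except ZeroDivisionError as err:
    print("k = -n, n = 2:", err)  # fails at j = 2(n - 1) = 2
```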


SUMMARY 


By substituting in an infinite series, we have obtained two solutions for the 
linear oscillator equation and one for Bessel’s equation (two if n is not an 
integer). To the questions “Can we always do this? Will this method always 
work?” the answer is no. This method of power series solution will not always 
work, as we explain next. 


Biographical Data 

Frobenius, Georg. Frobenius, a German mathematician (1849-1917), contributed
to matrices, groups, and algebra as well as differential equations.


7 $J_n(x)$ is an even function if $n$ is an even integer, and it is an odd function if $n$ is an odd integer.
For nonintegral $n$ the $x^n$ has no such simple symmetry.


Regular and Irregular Singularities


The success of the series substitution method depends on the roots of the indicial
equation and the degree of singularity of the coefficients in the differential
equation. To understand better the effect of the equation coefficients on this
naive series substitution approach, consider four simple equations:

$$y'' - \frac{6}{x^2}y = 0, \tag{8.72a}$$

$$y'' - \frac{6}{x^3}y = 0, \tag{8.72b}$$

$$y'' + \frac{1}{x}y' - \frac{a^2}{x^2}y = 0, \tag{8.72c}$$

$$y'' + \frac{1}{x^2}y' - \frac{a^2}{x^2}y = 0. \tag{8.72d}$$

You may show that for Eq. (8.72a) the indicial equation is

$$k^2 - k - 6 = 0,$$

giving $k = 3, -2$. Since the equation is homogeneous in $x$ (counting $d^2/dx^2$ as
$x^{-2}$), there is no recurrence relation. However, we are left with two perfectly
good solutions, $x^3$ and $x^{-2}$.

Equation (8.72b) differs from Eq. (8.72a) by only one power of $x$, but this
changes the indicial equation to

$$-6a_0 = 0,$$

with no solution at all because we have agreed that $a_0 \neq 0$. Our series substitution
worked for Eq. (8.72a), which had only a regular singularity, but broke
down in Eq. (8.72b), which has an irregular singular point at the origin.

Continuing with Eq. (8.72c), we have added a term $y'/x$. The indicial equation
is

$$k^2 - a^2 = 0,$$

but again there is no recurrence relation. The solutions are $y = x^a$, $x^{-a}$,
both perfectly acceptable one-term series. Despite the regular singularity at
the origin, two independent solutions exist in this case.

When we change the power of $x$ in the coefficient of $y'$ from $-1$ to $-2$ [Eq.
(8.72d)], there is a drastic change in the solution. The indicial equation (with
only the $y'$ term contributing) becomes

$$k = 0.$$


There is a recurrence relation

$$a_{j+1} = +a_j\frac{a^2 - j(j-1)}{j+1}.$$

Unless the parameter $a$ is selected to make the series terminate, we have

$$\lim_{j\to\infty}\left|\frac{a_{j+1}}{a_j}\right| = \lim_{j\to\infty}\frac{j(j-1)}{j+1} = \lim_{j\to\infty}\frac{j^2}{j} = \infty.$$

Hence, our series solution diverges for all $x \neq 0$. Again, our method worked
for Eq. (8.72c) with a regular singularity but failed when we had the irregular
singularity of Eq. (8.72d).
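
The growth of the coefficients for Eq. (8.72d) can be seen by iterating its recurrence relation with exact arithmetic. In the sketch below the value $a^2 = 1/2$ is an arbitrary nonterminating choice, and $a^2 = 2$ is the terminating case treated in Exercise 8.5.12; the function name is an assumption for this illustration.

```python
from fractions import Fraction


def coefficients_872d(a_squared, count):
    """Coefficients a_0 = 1, a_1, ... from a_{j+1} = a_j (a^2 - j(j-1)) / (j+1), with k = 0."""
    coeffs = [Fraction(1)]
    for j in range(count - 1):
        coeffs.append(coeffs[-1] * (a_squared - j * (j - 1)) / (j + 1))
    return coeffs


generic = coefficients_872d(Fraction(1, 2), 12)
ratios = [abs(generic[j + 1] / generic[j]) for j in range(len(generic) - 1)]
print([float(r) for r in ratios])   # the ratio |a_{j+1}/a_j| keeps growing with j

print(coefficients_872d(2, 6))      # a^2 = 2: the series stops after the x**2 term
```

For $a^2 = 2$ the surviving coefficients are those of the polynomial $1 + 2x + 2x^2$.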


Fuchs’s Theorem 

The answer to the basic question as to when the method of series substitution 
can be expected to work is given by Fuchs’s theorem, which asserts that we 

can always obtain at least one power series solution, provided we are 
expanding about a point that is an ordinary point or at worst a regular 
singular point. 

If we attempt an expansion about an irregular or essential singularity, our 
method may fail as it did for Eqs. (8.72b) and (8.72d). Fortunately, the more 
important equations of mathematical physics have no irregular singularities in 
the finite plane. Further discussion of Fuchs’s theorem appears in Section 8.6. 


Summary 


If we are expanding about an ordinary point or, at worst, about a regular 
singularity, the series substitution approach will yield at least one solution 
(Fuchs’s theorem). 

Whether we get one or two distinct solutions depends on the roots of the 
indicial equation: 

• If the two roots of the indicial equation are equal, we can obtain only one 
solution by this series substitution method. 

• If the two roots differ by a nonintegral number, two independent solutions 
may be obtained. 

• If the two roots differ by an integer, the larger of the two will yield a solution.
The smaller may or may not give a solution, depending on the behavior of
the coefficients. In the linear oscillator equation we obtain two solutions; for
Bessel’s equation, only one solution is obtained.
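
The three cases can be sorted mechanically once the indicial roots are known. The sketch below assumes an expansion about the origin with leading behavior $P(x) \approx p_{-1}/x$ and $Q(x) \approx q_{-2}/x^2$, for which the indicial equation takes the form $k(k-1) + p_{-1}k + q_{-2} = 0$; this reproduces the indicial equations found above for Eqs. (8.72a) and (8.72c) and for Bessel’s equation. The function names and tolerance are assumptions for this illustration.

```python
import cmath


def indicial_roots(p_lead, q_lead):
    """Roots of k*(k - 1) + p_lead*k + q_lead = 0."""
    b = p_lead - 1
    disc = cmath.sqrt(b * b - 4 * q_lead)
    return (-b + disc) / 2, (-b - disc) / 2


def describe_roots(k1, k2, tol=1e-12):
    """Classify the pair of roots according to the three cases in the summary."""
    diff = k1 - k2
    if abs(diff) < tol:
        return "equal roots"
    if abs(diff.imag) < tol and abs(diff.real - round(diff.real)) < tol:
        return "roots differ by an integer"
    return "roots differ by a noninteger"


print(describe_roots(*indicial_roots(0, -6)))      # Eq. (8.72a): k = 3, -2
print(describe_roots(*indicial_roots(1, 0)))       # Bessel, n = 0
print(describe_roots(*indicial_roots(1, -1 / 9)))  # Bessel, n = 1/3
print(describe_roots(*indicial_roots(0, 0)))       # oscillator at an ordinary point: k = 0, 1
```

Which case applies only tells us how many series solutions to expect; in the integer case the smaller root still has to be tested through its recurrence relation.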

The usefulness of the series solution for obtaining actual numerical values of
the solution depends on the rapidity of convergence of the series and the availability
of the coefficients. Many ODEs will not yield simple recurrence relations
for the coefficients. In general, the available series will probably be useful when
$|x|$ (or $|x - x_0|$) is very small. Computers can be used to determine additional series
coefficients using a symbolic language, such as Mathematica,$^8$ Maple,$^9$ or
Reduce.$^{10}$ Often, however, for numerical work a direct numerical integration
will be preferred (Section 8.7).


8 Wolfram, S. (1991). Mathematica, A System for Doing Mathematics by Computer. Addison
Wesley, New York.

9 Heck, A. (1993). Introduction to Maple. Springer, New York.

10 Rayna, G. (1987). Reduce Software for Algebraic Computation. Springer, New York.

EXERCISES

8.5.1 Uniqueness theorem. The function $y(x)$ satisfies a second-order, linear,
homogeneous differential equation. At $x = x_0$, $y(x) = y_0$ and $dy/dx = y_0'$.
Show that $y(x)$ is unique in that no other solution of this differential
equation passes through the point $(x_0, y_0)$ with a slope of $y_0'$.

Hint. Assume a second solution satisfying these conditions and compare
the Taylor series expansions.

8.5.2 A series solution of Eq. (8.47) is attempted, expanding about the point
$x = x_0$. If $x_0$ is an ordinary point, show that the indicial equation has
roots $k = 0, 1$.

8.5.3 In the development of a series solution of the simple harmonic oscillator
(SHO) equation the second series coefficient $a_1$ was neglected except
to set it equal to zero. From the coefficient of the next to the lowest
power of $x$, $x^{k-1}$, develop a second indicial-type equation.

(a) SHO equation with $k = 0$: Show that $a_1$ may be assigned any finite
value (including zero).

(b) SHO equation with $k = 1$: Show that $a_1$ must be set equal to zero.

8.5.4 Analyze the series solutions of the following differential equations to
see when $a_1$ may be set equal to zero without irrevocably losing anything
and when $a_1$ must be set equal to zero.

(a) Legendre, (b) Bessel, (c) Hermite.

ANS. (a) Legendre and (c) Hermite: For $k = 0$, $a_1$ may be
set equal to zero; for $k = 1$, $a_1$ must be set equal
to zero.

(b) Bessel: $a_1$ must be set equal to zero (except for
$k = \pm n = -\tfrac{1}{2}$).


8.5.5 Solve the Legendre equation

$$(1 - x^2)y'' - 2xy' + n(n+1)y = 0$$

by direct series substitution and plot the solution for $n = 0, 1, 2, 3$.

(a) Verify that the indicial equation is

$$k(k-1) = 0.$$

(b) Using $k = 0$, obtain a series of even powers of $x$ ($a_1 = 0$).

$$y_{\text{even}} = a_0\left[1 - \frac{n(n+1)}{2!}x^2 + \frac{n(n-2)(n+1)(n+3)}{4!}x^4 + \cdots\right],$$

where

$$a_{j+2} = \frac{j(j+1) - n(n+1)}{(j+1)(j+2)}a_j.$$

(c) Using $k = 1$, develop a series of odd powers of $x$ ($a_1 = 0$).

$$y_{\text{odd}} = a_0\left[x - \frac{(n-1)(n+2)}{3!}x^3 + \frac{(n-1)(n-3)(n+2)(n+4)}{5!}x^5 + \cdots\right],$$

where

$$a_{j+2} = \frac{(j+1)(j+2) - n(n+1)}{(j+2)(j+3)}a_j.$$

(d) Show that both solutions, $y_{\text{even}}$ and $y_{\text{odd}}$, diverge for $x = \pm 1$ if the
series continue to infinity.

(e) Finally, show that by an appropriate choice of $n$, one series at a
time may be converted into a polynomial, thereby avoiding the
divergence catastrophe. In quantum mechanics this restriction
of $n$ to integral values corresponds to quantization of angular
momentum.


8.5.6 Develop series solutions for Hermite’s differential equation

(a) $y'' - 2xy' + 2\alpha y = 0$.

ANS. $k(k-1) = 0$, indicial equation.

For $k = 0$

$$a_{j+2} = 2a_j\frac{j - \alpha}{(j+1)(j+2)} \quad (j \text{ even}),$$

$$y_{\text{even}} = a_0\left[1 + \frac{2(-\alpha)x^2}{2!} + \frac{2^2(-\alpha)(2-\alpha)x^4}{4!} + \cdots\right].$$

For $k = 1$

$$a_{j+2} = 2a_j\frac{j + 1 - \alpha}{(j+2)(j+3)} \quad (j \text{ even}),$$

$$y_{\text{odd}} = a_0\left[x + \frac{2(1-\alpha)x^3}{3!} + \frac{2^2(1-\alpha)(3-\alpha)x^5}{5!} + \cdots\right].$$

(b) Show that both series solutions are convergent for all $x$, the ratio
of successive coefficients behaving, for large index, like the corresponding
ratio in the expansion of $\exp(x^2)$.

(c) Show that by appropriate choice of $\alpha$ the series solutions may be
cut off and converted to finite polynomials. (These polynomials,
properly normalized, become the Hermite polynomials in Section
13.1.)


8.5.7 Laguerre’s ODE is

$$xL_n''(x) + (1 - x)L_n'(x) + nL_n(x) = 0.$$

Develop a series solution selecting the parameter $n$ to make your series
a polynomial and plot the partial series for the three lowest values of
$n$ and enough terms to demonstrate convergence.

8.5.8 A quantum mechanical analysis of the Stark effect (parabolic coordinates)
leads to the differential equation

$$\frac{d}{d\xi}\left(\xi\frac{du}{d\xi}\right) + \left(\frac{1}{2}E\xi + \alpha - \frac{m^2}{4\xi} - \frac{1}{4}F\xi^2\right)u = 0,$$

where $\alpha$ is a separation constant, $E$ is the total energy, and $F$ is a constant;
$Fz$ is the potential energy added to the system by the introduction
of an electric field.

Using the larger root of the indicial equation, develop a power series
solution about $\xi = 0$. Evaluate the first three coefficients in terms of
$a_0$, the lowest coefficient in the power series for $u(\xi)$ below.

Indicial equation $k^2 - \dfrac{m^2}{4} = 0$,

$$u(\xi) = a_0\xi^{m/2}\left\{1 - \frac{\alpha}{m+1}\xi + \left[\frac{\alpha^2}{2(m+1)(m+2)} - \frac{E}{4(m+2)}\right]\xi^2 + \cdots\right\}.$$

Note that the perturbation $F$ does not appear until $a_3$ is included.

8.5.9 For the special case of no azimuthal dependence, the quantum mechanical
analysis of the hydrogen molecular ion leads to the equation

$$\frac{d}{d\eta}\left[(1 - \eta^2)\frac{du}{d\eta}\right] + \alpha u + \beta\eta^2u = 0.$$

Develop a power series solution for $u(\eta)$. Evaluate the first three nonvanishing
coefficients in terms of $a_0$.

Indicial equation $k(k-1) = 0$.

$$u_{k=1} = a_0\eta\left\{1 + \frac{2 - \alpha}{6}\eta^2 + \left[\frac{(2-\alpha)(12-\alpha)}{120} - \frac{\beta}{20}\right]\eta^4 + \cdots\right\}.$$



8.5.10 To a good approximation, the interaction of two nucleons may be
described by a mesonic potential

$$V = \frac{Ae^{-ax}}{x},$$

attractive for $A$ negative. Develop a series solution of the resultant
Schrödinger wave equation

$$\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (E - V)\psi = 0$$

through the first three nonvanishing coefficients:

$$\psi_{k=1} = a_0\left\{x + \frac{1}{2}A'x^2 + \frac{1}{6}\left[\frac{1}{2}A'^2 - E' - aA'\right]x^3 + \cdots\right\},$$

where the prime indicates multiplication by $2m/\hbar^2$. Plot the solution
for $a = 0.7$ fm$^{-1}$ and $A = -0.1$.

8.5.11 Near the nucleus of a complex atom the potential energy of one electron
is given by

$$V = \frac{Ze^2}{r}(1 + b_1r + b_2r^2),$$

where the coefficients $b_1$ and $b_2$ arise from screening effects. For the
case of zero angular momentum determine the first three terms of the
solution of the Schrödinger equation; that is, write out the first three
terms in a series expansion of the wave function. Plot the potential and
wave function.

8.5.12 If the parameter $a^2$ in Eq. (8.72d) is equal to 2, Eq. (8.72d) becomes

$$y'' + \frac{1}{x^2}y' - \frac{2}{x^2}y = 0.$$

From the indicial equation and the recurrence relation, derive a solution
$y = 1 + 2x + 2x^2$. Verify that this is indeed a solution by substituting
back into the differential equation.


8.6 A Second Solution 


In Section 8.5, a solution of a second-order homogeneous ODE was developed
by substituting in a power series. By Fuchs’s theorem this is possible, provided
the power series is an expansion about an ordinary point or a nonessential
singularity.$^{11}$ There is no guarantee that this approach will yield the two
independent solutions we expect from a linear second-order ODE. Indeed, the
technique gives only one solution for Bessel’s equation ($n$ an integer). In this
section, we develop two methods of obtaining a second independent solution:
an integral method and a power series containing a logarithmic term.

Returning to our linear, second-order, homogeneous ODE of the general
form

$$y'' + P(x)y' + Q(x)y = 0, \tag{8.73}$$

let $y_1$ and $y_2$ be two independent solutions. Then the Wronskian, by definition,
is

$$W = y_1y_2' - y_1'y_2. \tag{8.74}$$

By differentiating the Wronskian, we obtain

$$W' = y_1'y_2' + y_1y_2'' - y_1''y_2 - y_1'y_2'
= y_1[-P(x)y_2' - Q(x)y_2] - y_2[-P(x)y_1' - Q(x)y_1]
= -P(x)(y_1y_2' - y_1'y_2). \tag{8.75}$$

The expression in parentheses is just $W$, the Wronskian, and we have

$$W' = -P(x)W. \tag{8.76}$$

11 This is why the classification of singularities in Section 8.4 is of vital importance.

In the special case that $P(x) = 0$, that is,

$$y'' + Q(x)y = 0, \tag{8.77}$$

the Wronskian

$$W = y_1y_2' - y_1'y_2 = \text{constant}. \tag{8.78}$$

Since our original differential equation is homogeneous, we may multiply the
solutions $y_1$ and $y_2$ by whatever constants we wish and arrange to have the
Wronskian equal to unity (or $-1$). This case, $P(x) = 0$, appears more frequently
than might be expected. Recall that $\nabla^2$ in Cartesian coordinates contains
no first derivative. Similarly, the radial dependence of $\nabla^2(r\psi)$ in spherical
polar coordinates lacks a first derivative. Finally, every linear second-order
differential equation can be transformed into an equation of the form of Eq.
(8.77) (compare Exercise 8.6.3).
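
Both statements about the Wronskian can be checked numerically. The sketch below uses the solutions $x^a$ and $x^{-a}$ of Eq. (8.72c), for which $P(x) = 1/x$, to test Eq. (8.76), and the pair $\sin x$, $\cos x$ of $y'' + y = 0$ to illustrate the constant Wronskian of Eq. (8.78). The helper names, the step size of the finite difference, and the sample values are assumptions for this illustration.

```python
import math


def wronskian_872c(x, a):
    """W = y1*y2' - y1'*y2 for y1 = x**a, y2 = x**(-a), the solutions of Eq. (8.72c)."""
    y1, dy1 = x**a, a * x ** (a - 1)
    y2, dy2 = x ** (-a), -a * x ** (-a - 1)
    return y1 * dy2 - dy1 * y2


def derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x = 1.5, 2.0
P = lambda t: 1 / t
lhs = derivative(lambda t: wronskian_872c(t, a), x)  # W'
rhs = -P(x) * wronskian_872c(x, a)                   # -P W
print(lhs, rhs)


def wronskian_oscillator(x):
    """W for y1 = sin x, y2 = cos x, solutions of y'' + y = 0 (P = 0)."""
    return math.sin(x) * (-math.sin(x)) - math.cos(x) * math.cos(x)


print([wronskian_oscillator(t) for t in (0.0, 0.7, 2.5)])  # the same constant each time
```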

For the general case, let us assume that we have one solution of Eq. (8.73)
by a series substitution (or by guessing). We now proceed to develop a second,
independent solution for which $W \neq 0$. Rewriting Eq. (8.76) as

$$\frac{dW}{W} = -P\,dx_1,$$

we integrate from $x_1 = a$ to $x_1 = x$ to obtain$^{12}$

$$W(x) = W(a)\exp\left[-\int_a^x P(x_1)\,dx_1\right]. \tag{8.79}$$

However,

$$W(x) = y_1 y_2' - y_1' y_2 = y_1^2\,\frac{d}{dx}\left(\frac{y_2}{y_1}\right). \tag{8.80}$$

By combining Eqs. (8.79) and (8.80), we have

$$\frac{d}{dx}\left(\frac{y_2}{y_1}\right) = W(a)\,\frac{\exp\left[-\int_a^x P(x_1)\,dx_1\right]}{y_1^2(x)}. \tag{8.81}$$

Finally, by integrating Eq. (8.81) from $x_2 = b$ to $x_2 = x$ we get

$$y_2(x) = y_1(x)\,W(a)\int_b^x \frac{\exp\left[-\int_a^{x_2} P(x_1)\,dx_1\right]}{[y_1(x_2)]^2}\,dx_2, \tag{8.82}$$


12 If $P(x_1)$ remains finite for $a \le x_1 \le x$, $W(x) \neq 0$ unless $W(a) = 0$. That is, the Wronskian of our
two solutions is either identically zero or never zero.

where $a$ and $b$ are arbitrary constants and a term $y_1(x) y_2(b)/y_1(b)$ has been
dropped because it leads to nothing new. Since $W(a)$, the Wronskian evaluated
at $x = a$, is a constant and our solutions for the homogeneous differential
equation always contain an unknown normalizing factor, we set $W(a) = 1$ and
write

$$y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^{x_2} P(x_1)\,dx_1\right]}{[y_1(x_2)]^2}\,dx_2. \tag{8.83}$$

Note that the lower limits $x_1 = a$ and $x_2 = b$ have been omitted. If they are
retained, they simply make a contribution equal to a constant times the known
first solution, $y_1(x)$, and hence add nothing new.

If we have the important special case of $P(x) = 0$, Eq. (8.83) reduces to

$$y_2(x) = y_1(x)\int^x \frac{dx_2}{[y_1(x_2)]^2}. \tag{8.84}$$

This means that by using either Eq. (8.83) or Eq. (8.84), we can take one known
solution and by integrating can generate a second independent solution of Eq.
(8.73). As we shall see later, this technique to generate the second solution
from the power series of the first solution $y_1(x)$ can be tedious.


EXAMPLE 8.6.1


A Second Solution for the Linear Oscillator Equation From $d^2y/dx^2 + y = 0$
with $P(x) = 0$, let one solution be $y_1 = \sin x$. By applying Eq. (8.84), we
obtain

$$y_2(x) = \sin x \int^x \frac{dx_2}{\sin^2 x_2} = \sin x\,(-\cot x) = -\cos x,$$

which is clearly independent (not a linear multiple) of $\sin x$. ■
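
The integral of Eq. (8.84) can also be evaluated numerically, which is a convenient check when the antiderivative is not known in closed form. The following is a minimal sketch, not a production routine. The helper names `simpson` and `second_solution`, the composite Simpson quadrature, and the explicit lower limit $b = \pi/2$ are our own illustrative choices. With that lower limit the integral is $-\cot x + \cot b = -\cot x$, so the result should reproduce $-\cos x$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def second_solution(y1, x, b):
    """Eq. (8.84) with an explicit lower limit b:
    y2(x) = y1(x) * integral from b to x of ds / y1(s)^2."""
    return y1(x) * simpson(lambda s: 1.0 / y1(s) ** 2, b, x)

# The lower limit b = pi/2 is an arbitrary choice; there cot(b) = 0.
b = math.pi / 2
for x in (0.5, 1.0, 2.0):
    y2 = second_solution(math.sin, x, b)
    print(f"x = {x:.1f}  y2 = {y2:+.8f}  -cos x = {-math.cos(x):+.8f}")
```

A different lower limit only adds a constant multiple of $\sin x$, in line with the remark following Eq. (8.83). The interval of integration must not contain a zero of $y_1$.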



Series Form of the Second Solution 

Further insight into the nature of the second solution of our differential equation
may be obtained by the following sequence of operations:


1. Express $P(x)$ and $Q(x)$ in Eq. (8.73) as

$$P(x) = \sum_{i=-1}^{\infty} p_i x^i, \qquad Q(x) = \sum_{j=-2}^{\infty} q_j x^j. \tag{8.85}$$

The lower limits of the summations are selected to create the strongest
possible regular singularity (at the origin). These conditions just satisfy
Fuchs's theorem and thus help us gain a better understanding of that
theorem.

2. Develop the first few terms of a power series solution, as in Section 8.5.

3. Using this solution as $y_1$, obtain a second series-type solution, $y_2$, with Eq.
(8.83), integrating term by term.











Proceeding with step 1, we have

$$y'' + \left(p_{-1}x^{-1} + p_0 + p_1 x + \cdots\right)y' + \left(q_{-2}x^{-2} + q_{-1}x^{-1} + \cdots\right)y = 0, \tag{8.86}$$

in which point $x = 0$ is at worst a regular singular point. If $p_{-1} = q_{-1} = q_{-2} = 0$,
it reduces to an ordinary point. Substituting

$$y = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda}$$

(step 2), we obtain

$$\sum_{\lambda=0}^{\infty} (k+\lambda)(k+\lambda-1)\,a_\lambda x^{k+\lambda-2} + \sum_{i=-1}^{\infty} p_i x^i \sum_{\lambda=0}^{\infty} (k+\lambda)\,a_\lambda x^{k+\lambda-1} + \sum_{j=-2}^{\infty} q_j x^j \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} = 0. \tag{8.87}$$

Our indicial equation is

$$k(k-1) + p_{-1}k + q_{-2} = 0,$$

which sets the coefficient of $x^{k-2}$ equal to zero. This reduces to

$$k^2 + (p_{-1} - 1)k + q_{-2} = 0. \tag{8.88}$$

We denote the two roots of this indicial equation by $k = \alpha$ and $k = \alpha - n$,
where $n$ is zero or a positive integer. (If $n$ is not an integer, we expect two
independent series solutions by the methods of Section 8.5 and we are done.)
Then

$$(k - \alpha)(k - \alpha + n) = 0$$

or

$$k^2 + (n - 2\alpha)k + \alpha(\alpha - n) = 0, \tag{8.89}$$

and equating coefficients of $k$ in Eqs. (8.88) and (8.89), we have

$$p_{-1} - 1 = n - 2\alpha. \tag{8.90}$$

The known series solution corresponding to the larger root $k = \alpha$ may be
written as

$$y_1 = x^{\alpha}\sum_{\lambda=0}^{\infty} a_\lambda x^{\lambda}.$$


Substituting this series solution into Eq. (8.83) (step 3), we are faced with the
formidable-looking expression,

$$y_2(x) = y_1(x)\int^x \frac{\exp\left(-\int_a^{x_2}\sum_{i=-1}^{\infty} p_i x_1^i\,dx_1\right)}{x_2^{2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^{\lambda}\right)^2}\,dx_2, \tag{8.91}$$

where the solutions $y_1$ and $y_2$ have been normalized so that the Wronskian
$W(a) = 1$. Handling the exponential factor first, we have

$$\int_a^{x_2}\sum_{i=-1}^{\infty} p_i x_1^i\,dx_1 = p_{-1}\ln x_2 + \sum_{k=0}^{\infty}\frac{p_k}{k+1}\,x_2^{k+1} + f(a), \tag{8.92}$$

where $f(a) = -p_{-1}\ln a$ is an integration constant from the $i = -1$ term that
leads to an unimportant overall factor and can be dropped. Hence,

$$\begin{aligned}
\exp\left(-\int_a^{x_2}\sum_{i} p_i x_1^i\,dx_1\right)
&= \exp[-f(a)]\,x_2^{-p_{-1}}\exp\left(-\sum_{k=0}^{\infty}\frac{p_k}{k+1}\,x_2^{k+1}\right) \\
&= \exp[-f(a)]\,x_2^{-p_{-1}}\left[1 - \sum_{k=0}^{\infty}\frac{p_k}{k+1}\,x_2^{k+1} + \frac{1}{2!}\left(\sum_{k=0}^{\infty}\frac{p_k}{k+1}\,x_2^{k+1}\right)^2 - \cdots\right].
\end{aligned} \tag{8.93}$$

This final series expansion of the exponential is certainly convergent if the
original expansion of the coefficient $P(x)$ was convergent.

The denominator in Eq. (8.91) may be handled by writing

$$\left[x_2^{2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^{\lambda}\right)^2\right]^{-1} = x_2^{-2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^{\lambda}\right)^{-2} = x_2^{-2\alpha}\sum_{\lambda=0}^{\infty} b_\lambda x_2^{\lambda}, \tag{8.94}$$

provided $a_0 \neq 0$. Neglecting constant factors that will be picked up anyway by
the requirement that $W(a) = 1$, we obtain

$$y_2(x) = y_1(x)\int^x x_2^{-p_{-1}-2\alpha}\left(\sum_{\lambda=0}^{\infty} c_\lambda x_2^{\lambda}\right)dx_2. \tag{8.95}$$

By Eq. (8.90),

$$x_2^{-p_{-1}-2\alpha} = x_2^{-n-1}, \tag{8.96}$$

where $n \ge 0$ is an integer. Substituting this result into Eq. (8.95), we obtain

$$y_2(x) = y_1(x)\int^x \left(c_0 x_2^{-n-1} + c_1 x_2^{-n} + c_2 x_2^{-n+1} + \cdots + c_n x_2^{-1} + \cdots\right)dx_2. \tag{8.97}$$

The integration indicated in Eq. (8.97) leads to a coefficient of $y_1(x)$ consisting
of two parts:


1. A power series starting with $x^{-n}$.

2. A logarithm term from the integration of $x^{-1}$ (when $\lambda = n$). This term always
appears when $n$ is an integer unless $c_n$ fortuitously happens to vanish.$^{13}$


EXAMPLE 8.6.2


A Second Solution of Bessel's Equation From Bessel's equation, Eq.
(8.62) [divided by $x^2$ to agree with Eq. (8.73)], we have

$$P(x) = x^{-1}, \qquad Q(x) = 1 \qquad \text{for the case } n = 0.$$

13 For parity considerations, $\ln x$ is taken to be $\ln |x|$, even.

Hence, $p_{-1} = 1$, $q_0 = 1$ in Eq. (8.85); all other $p_i$ and $q_j$ vanish. The Bessel
indicial equation is

$$k^2 = 0$$

[Eq. (8.65) with $n = 0$]. Hence, we verify Eqs. (8.88)-(8.90) with $n = 0$ and $\alpha = 0$.

Our first solution is available from Eq. (8.69). Relabeling it to agree with
Chapter 12 (and using $a_0 = 1$), we obtain$^{14}$

$$y_1(x) = J_0(x) = 1 - \frac{x^2}{4} + \frac{x^4}{64} - O(x^6), \tag{8.98a}$$

valid for all $x$ because of the absolute convergence of the series. Now, substituting
all this into Eq. (8.83), we have the specific case corresponding to Eq.
(8.91):

$$y_2(x) = J_0(x)\int^x \frac{\exp\left[-\int^{x_2} x_1^{-1}\,dx_1\right]}{\left[1 - x_2^2/4 + x_2^4/64 - \cdots\right]^2}\,dx_2. \tag{8.98b}$$

From the numerator of the integrand

$$\exp\left[-\int^{x_2}\frac{dx_1}{x_1}\right] = \exp[-\ln x_2] = \frac{1}{x_2}.$$

This corresponds to the $x_2^{-p_{-1}}$ in Eq. (8.93). From the denominator of the
integrand, using a binomial expansion, we obtain

$$\left[1 - \frac{x_2^2}{4} + \frac{x_2^4}{64}\right]^{-2} = 1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots.$$

Corresponding to Eq. (8.95), we have

$$\begin{aligned}
y_2(x) &= J_0(x)\int^x \frac{1}{x_2}\left[1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots\right]dx_2 \\
&= J_0(x)\left\{\ln x + \frac{x^2}{4} + \frac{5x^4}{128} + \cdots\right\}.
\end{aligned} \tag{8.98c}$$

Let us check this result. From Eqs. (12.60) and (12.62), which give the
standard form of the second solution,

$$Y_0(x) = \frac{2}{\pi}\left[\ln x - \ln 2 + \gamma\right]J_0(x) + \frac{2}{\pi}\left\{\frac{x^2}{4} - \frac{3x^4}{128} + \cdots\right\}. \tag{8.98d}$$

Two points arise. First, since Bessel's equation is homogeneous, we may multiply
$y_2(x)$ by any constant. To match $Y_0(x)$, we multiply our $y_2(x)$ by $2/\pi$.
Second, to our second solution $(2/\pi)y_2(x)$, we may add any constant multiple


14 The capital $O$ (order of) as written here means terms proportional to $x^6$ and possibly higher
powers of $x$.

of the first solution. Again, to match $Y_0(x)$ we add

$$\frac{2}{\pi}\left[-\ln 2 + \gamma\right]J_0(x),$$

where $\gamma$ is the Euler-Mascheroni constant (Section 5.2).$^{15}$ Our new, modified
second solution is

$$y_2(x) = \frac{2}{\pi}\left[\ln x - \ln 2 + \gamma\right]J_0(x) + \frac{2}{\pi}J_0(x)\left\{\frac{x^2}{4} + \frac{5x^4}{128} + \cdots\right\}. \tag{8.98e}$$

Now the comparison with $Y_0(x)$ becomes a simple multiplication of $J_0(x)$ from
Eq. (8.98a) and the curly bracket of Eq. (8.98c). The multiplication checks,
through terms of order $x^2$ and $x^4$, which is all we carried. Our second solution
from Eqs. (8.83) and (8.91) agrees with the standard second solution, the
Neumann function, $Y_0(x)$. ■


From the preceding analysis, the second solution of Eq. (8.83), $y_2(x)$, may
be written as

$$y_2(x) = y_1(x)\ln x + \sum_{j=-n}^{\infty} d_j x^{j+\alpha}, \tag{8.98f}$$

the first solution times $\ln x$ and another power series, this one starting with
$x^{\alpha-n}$, which means that we may look for a logarithmic term when the indicial
equation of Section 8.5 gives only one series solution. With the form of the
second solution specified by Eq. (8.98f), we can substitute Eq. (8.98f) into the
original differential equation and determine the coefficients $d_j$ exactly as in
Section 8.5. It is worth noting that no series expansion of $\ln x$ is needed. In the
substitution $\ln x$ will drop out; its derivatives will survive.

The second solution will usually diverge at the origin because of the logarithmic
factor and the negative powers of $x$ in the series. For this reason, $y_2(x)$
is often referred to as the irregular solution. The first series solution, $y_1(x)$,
which usually converges at the origin, is the regular solution. The question of
behavior at the origin is discussed in more detail in Chapters 11 and 12, in
which we take up Legendre functions and Bessel functions.

SUMMARY

The two solutions of both sections (together with the exercises) provide a complete
solution of our linear, homogeneous, second-order ODE, assuming
that the point of expansion is no worse than a regular singularity. At least one
solution can always be obtained by series substitution (Section 8.5). A second,
linearly independent solution can be constructed by the Wronskian double
integral [Eq. (8.83)]. No third, linearly independent solution exists.

The inhomogeneous, linear, second-order ODE has an additional solution:
the particular solution. This particular solution may be obtained by the
method of variation of parameters.
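
The series manipulations of Example 8.6.2 are easy to get wrong by hand, so a short exact-arithmetic check may be useful. The sketch below is our own illustration, not part of the original derivation. It treats every series as a truncated polynomial in $t = x^2$, and the helper names `mul` and `inv` are hypothetical. It recomputes the expansion of $1/J_0^2$, integrates $(1/x_2)$ times that expansion term by term, and multiplies the non-logarithmic part by $J_0$ for comparison with the curly bracket of Eq. (8.98d).

```python
from fractions import Fraction as F

N = 3  # keep the coefficients of x^0, x^2, x^4 (series in t = x^2)

def mul(a, b):
    """Product of two power series in t, truncated after N terms."""
    c = [F(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

def inv(a):
    """Reciprocal of a truncated power series with a[0] != 0."""
    b = [F(1) / a[0]] + [F(0)] * (N - 1)
    for k in range(1, N):
        b[k] = -sum(a[i] * b[k - i] for i in range(1, k + 1)) / a[0]
    return b

j0 = [F(1), F(-1, 4), F(1, 64)]      # J0 = 1 - x^2/4 + x^4/64 - ...
integrand = inv(mul(j0, j0))         # expansion of 1 / J0^2

# Integrating (1/x) * integrand term by term: the constant term gives ln x,
# and c_k x^(2k) / x integrates to c_k x^(2k) / (2k) for k >= 1.
bracket = [F(0)] + [integrand[k] / (2 * k) for k in range(1, N)]
power_part = mul(j0, bracket)        # non-logarithmic part of J0 * {...}

print("1/J0^2 coefficients      :", integrand)
print("bracket of Eq. (8.98c)   :", bracket)
print("J0 times that bracket    :", power_part)
```

The last list holds the coefficients of $x^0$, $x^2$, and $x^4$ in the power-series part of the second solution, to be compared with $x^2/4 - 3x^4/128$. Raising `N` (and extending `j0`) carries the check to higher order.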


15 The Neumann function $Y_0$ is defined as it is in order to achieve convenient asymptotic properties
(Section 12.3).

EXERCISES


8.6.1 Legendre's differential equation

$$(1 - x^2)y'' - 2xy' + n(n+1)y = 0$$

has a regular solution $P_n(x)$ and an irregular solution $Q_n(x)$. Show that
the Wronskian of $P_n$ and $Q_n$ is given by

$$P_n(x)Q_n'(x) - P_n'(x)Q_n(x) = \frac{A_n}{1 - x^2},$$

with $A_n$ independent of $x$.


8.6.2 Show, by means of the Wronskian, that a linear, second-order, homogeneous
ODE of the form

$$y''(x) + P(x)y'(x) + Q(x)y(x) = 0$$

cannot have three independent solutions. (Assume a third solution
and show that the Wronskian vanishes for all $x$.)


8.6.3 Transform our linear, second-order ODE

$$y'' + P(x)y' + Q(x)y = 0$$

by the substitution

$$y = z\exp\left[-\frac{1}{2}\int^x P(t)\,dt\right]$$

and show that the resulting differential equation for $z$ is

$$z'' + q(x)z = 0,$$

where

$$q(x) = Q(x) - \frac{1}{2}P'(x) - \frac{1}{4}P^2(x).$$

8.6.4 Use the result of Exercise 8.6.3 to show that the replacement of $\varphi(r)$
by $r\varphi(r)$ may be expected to eliminate the first derivative from the
Laplacian in spherical polar coordinates. See also Exercise 2.5.15(b).


8.6.5 By direct differentiation and substitution show that

$$y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds$$

satisfies

$$y_2''(x) + P(x)y_2'(x) + Q(x)y_2(x) = 0.$$

Note. The Leibniz formula for the derivative of an integral is

$$\frac{d}{d\alpha}\int_{g(\alpha)}^{h(\alpha)} f(x, \alpha)\,dx = \int_{g(\alpha)}^{h(\alpha)}\frac{\partial f(x, \alpha)}{\partial\alpha}\,dx + f[h(\alpha), \alpha]\frac{dh(\alpha)}{d\alpha} - f[g(\alpha), \alpha]\frac{dg(\alpha)}{d\alpha}.$$

8.6.6 In the equation

$$y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds,$$

$y_1(x)$ satisfies

$$y_1'' + P(x)y_1' + Q(x)y_1 = 0.$$

The function $y_2(x)$ is a linearly independent second solution of the
same equation. Show that the inclusion of lower limits on the two
integrals leads to nothing new; that is, it adds only overall factors and/or
a multiple of the known solution $y_1(x)$.

8.6.7 Given that one solution of

$$R'' + \frac{1}{r}R' - \frac{m^2}{r^2}R = 0$$

is $R = r^m$, show that Eq. (8.83) predicts a second solution, $R = r^{-m}$.

8.6.8 Using $y_1(x) = \sum_{n=0}^{\infty}(-1)^n x^{2n+1}/(2n+1)!$ as a solution of the linear
oscillator equation, follow the analysis culminating in Eq. (8.98f) and
show that $c_1 = 0$ so that the second solution does not, in this case,
contain a logarithmic term.

8.6.9 Show that when $n$ is not an integer the second solution of Bessel's
equation, obtained from Eq. (8.83), does not contain a logarithmic term.

8.6.10 (a) One solution of Hermite's differential equation

$$y'' - 2xy' + 2\alpha y = 0$$

for $\alpha = 0$ is $y_1(x) = 1$. Find a second solution $y_2(x)$ using Eq.
(8.83). Show that your second solution is equivalent to $y_{\rm odd}$
(Exercise 8.5.6).

(b) Find a second solution for $\alpha = 1$, where $y_1(x) = x$, using Eq. (8.83).
Show that your second solution is equivalent to $y_{\rm even}$ (Exercise
8.5.6).

8.6.11 One solution of Laguerre's differential equation

$$xy'' + (1 - x)y' + ny = 0$$

for $n = 0$ is $y_1(x) = 1$. Using Eq. (8.83), develop a second, linearly
independent solution. Exhibit the logarithmic term explicitly.

8.6.12 For Laguerre's equation with $n = 0$

$$y_2(x) = \int^x \frac{e^s}{s}\,ds.$$

(a) Write $y_2(x)$ as a logarithm plus a power series.

(b) Verify that the integral form of $y_2(x)$, previously given, is a solution
of Laguerre's equation ($n = 0$) by direct differentiation of the
integral and substitution into the differential equation.

(c) Verify that the series form of $y_2(x)$, part (a), is a solution by differentiating
the series and substituting back into Laguerre's equation.

8.6.13 The radial Schrodinger wave equation has the form

$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + l(l+1)\frac{\hbar^2}{2mr^2} + V(r)\right]y(r) = E\,y(r).$$

The potential energy $V(r)$ may be expanded about the origin as

$$V(r) = \frac{b_{-1}}{r} + b_0 + b_1 r + \cdots.$$

(a) Show that there is one (regular) solution starting with $r^{l+1}$.

(b) From Eq. (8.83) show that the irregular solution diverges at the
origin as $r^{-l}$.

8.6.14 Show that if a second solution, $y_2$, is assumed to have the form $y_2(x) =
y_1(x)f(x)$, substitution back into the original equation

$$y_2'' + P(x)y_2' + Q(x)y_2 = 0$$

leads to

$$f(x) = \int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds,$$

in agreement with Eq. (8.83).


8.6.15 If our linear, second-order ODE is nonhomogeneous (that is, of the
form of Eq. (8.44)), the most general solution is

$$y(x) = y_1(x) + y_2(x) + y_p(x).$$

($y_1$ and $y_2$ are independent solutions of the homogeneous equation.)
Show that

$$y_p(x) = y_2(x)\int^x \frac{y_1(s)F(s)\,ds}{W\{y_1(s), y_2(s)\}} - y_1(x)\int^x \frac{y_2(s)F(s)\,ds}{W\{y_1(s), y_2(s)\}},$$

where $W\{y_1(s), y_2(s)\}$ is the Wronskian of $y_1(s)$ and $y_2(s)$.

Hint. Let $y_p(x) = y_1(x)v(x)$ and develop a first-order ODE for $v'(x)$.


8.6.16 (a) Show that

$$y'' + \frac{1 - \alpha^2}{4x^2}\,y = 0$$

has two solutions:

$$y_1(x) = a_0 x^{(1+\alpha)/2}, \qquad y_2(x) = a_0 x^{(1-\alpha)/2}.$$

(b) For $\alpha = 0$ the two linearly independent solutions of part (a) reduce
to $y_{10} = a_0 x^{1/2}$. Using Eq. (8.83) derive a second solution

$$y_{20}(x) = a_0 x^{1/2}\ln x.$$

Verify that $y_{20}$ is indeed a solution.

(c) Show that the second solution from part (b) may be obtained as a
limiting case from the two solutions of part (a):

$$y_{20}(x) = \lim_{\alpha\to 0}\left(\frac{y_1 - y_2}{\alpha}\right).$$

8.7 Numerical Solutions 


The analytic solutions and approximate solutions to differential equations in 
this chapter and in succeeding chapters may suffice to solve the problem 
at hand, particularly if there is some symmetry present. The power series 
solutions show how the solution behaves at small values of x. The asymptotic 
solutions show how the solution behaves at large values of x. These limiting
cases and also the possible resemblance of our differential equation to
the standard forms with known solutions (Chapters 11-13) are invaluable in 
helping us gain an understanding of the general behavior of our solution. 

However, the usual situation is that we have a different equation, perhaps a 
different potential in the Schrodinger wave equation, and we want a reasonably 
exact solution. So we turn to numerical techniques. 


First-Order Differential Equations 


The differential equation involves a continuity of points. The independent variable
$x$ is continuous. The (unknown) dependent variable $y(x)$ is assumed continuous.
The concept of differentiation demands continuity. Numerical processes
replace these continua by discrete sets. We take $x$ to have only specific
values on a uniform grid, such as at

$$x_0,\quad x_0 + h,\quad x_0 + 2h,\quad x_0 + 3h,\quad \text{and so on},$$

where $h$ is some small interval. The smaller $h$ is, the better the approximation
is in principle. However, if $h$ is made too small, the demands on machine time
will be excessive, and accuracy may actually decline because of accumulated
round-off errors. In practice, therefore, one chooses a step size by trial and
error, or uses a code that adapts the step size so as to minimize round-off errors.
We refer to the successive discrete values of $x$ as $x_n$, $x_{n+1}$, and so on, and the
corresponding values of $y(x)$ as $y(x_n) = y_n$. If $x_0$ and $y_0$ are given, the problem
is to find $y_1$, then to find $y_2$, and so on.


Taylor Series Solution 


Consider the ordinary (possibly nonlinear) first-order differential equation

$$\frac{d}{dx}y(x) = f(x, y) \tag{8.99}$$

with the initial condition $y(x_0) = y_0$. In principle, a step-by-step solution of the
first-order equation [Eq. (8.99)] may be developed to any degree of accuracy
by a Taylor expansion

$$y(x_0 + h) = y(x_0) + h y'(x_0) + \frac{h^2}{2!}y''(x_0) + \cdots + \frac{h^n}{n!}y^{(n)}(x_0) + \cdots, \tag{8.100}$$

assuming the derivatives exist and the series is convergent. The initial value
$y(x_0)$ is known and $y'(x_0)$ is given as $f(x_0, y_0)$. In principle, the higher derivatives
may be obtained by differentiating $y'(x) = f(x, y)$. In practice, this differentiation
may be tedious. Now, however, this differentiation can be done by
computer using symbolic software, such as Mathematica, Maple, or Reduce,
or numerical packages. For equations of the form encountered in this chapter,
a computer has no trouble generating and evaluating 10 or more derivatives.

The Taylor series solution is a form of analytic continuation (Section 6.5). 
If the right-hand side of Eq. (8.100) is truncated after two terms, we have 


$$y_1 = y_0 + h y_0' = y_0 + h f(x_0, y_0),\ \ldots,\ y_{n+1} = y_n + h f(x_n, y_n), \tag{8.101}$$

neglecting the terms of order $h^2$. Equation (8.101) is often called the Euler
solution. Clearly, it is subject to serious error with the neglect of terms of
order $h^2$. Let us discuss a specific case.
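
To make the size of the Euler error concrete, the following minimal sketch applies Eq. (8.101) to the test equation $y' = -y^2$ with $y(1) = 1$, whose exact solution is $y = 1/x$. The function name `euler`, the integration interval $1 \le x \le 2$, and the particular step sizes are our own illustrative choices.

```python
def euler(f, x0, y0, h, n_steps):
    """Advance y' = f(x, y) from (x0, y0) by n_steps Euler steps, Eq. (8.101)."""
    x, y = x0, y0
    for _ in range(n_steps):
        y += h * f(x, y)
        x += h
    return x, y

def f(x, y):
    """Right-hand side of the test ODE y' = -y^2 (exact solution y = 1/x)."""
    return -y * y

# Integrate from x = 1 to x = 2 with successively halved step sizes.
for n in (10, 20, 40, 80):
    h = 1.0 / n
    x_end, y_end = euler(f, 1.0, 1.0, h, n)
    print(f"h = {h:.4f}  y(2) = {y_end:.6f}  error = {abs(y_end - 0.5):.2e}")
```

Because the neglected terms are of order $h^2$ per step, the accumulated error over a fixed interval is expected to shrink only roughly in proportion to $h$, which is why Euler's method is rarely used on its own.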


EXAMPLE 8.7.1


Taylor Series Approximation for First-Order ODE Because there is no
general method for solving first-order ODEs, we often resort to numerical
approximations. From $y' = f(x, y)$, we obtain by differentiation of the ODE

$$y'' = \frac{\partial f}{\partial x}(x, y) + \frac{\partial f}{\partial y}(x, y)\,y',$$

$$y''' = \frac{\partial^2 f}{\partial x^2}(x, y) + \cdots + \frac{\partial f}{\partial y}(x, y)\,y'',$$

etc. Starting from the point $(x_0, y_0)$, we determine $y(x_0)$, $y'(x_0)$, $y''(x_0)$, ...
from these derivatives of the ODE and plug them into the Taylor expansion in
order to get to a neighboring point, from which we continue the process.

To be specific, consider the ODE $y' + y^2 = 0$, whose analytic solution
through the point $(x = 1, y = 1)$ is the hyperbola $y(x) = 1/x$. Following the
approximation method we just outlined, we find

$$y'' = -2yy', \qquad y''' = -2(yy'' + y'^2), \qquad y^{(IV)} = -2(yy''' + 3y'y''), \ldots.$$

The resulting Taylor series

$$\begin{aligned}
y(x) &= y(1) + (x - 1)y'(1) + \frac{(x - 1)^2}{2}y''(1) + \frac{(x - 1)^3}{3!}y'''(1) + \cdots \\
&= 1 - (x - 1) + (x - 1)^2 - (x - 1)^3 + (x - 1)^4 - \cdots = \frac{1}{1 + (x - 1)} = \frac{1}{x}
\end{aligned}$$

for $0 < x < 2$ indeed confirms the exact solution and extends its validity
beyond the interval of convergence. ■

Runge-Kutta Method


The Runge-Kutta method is a refinement of Euler's approximation [Eq. (8.101)].
The fourth-order Runge-Kutta approximation has an error of order $h^5$. The relevant
formulas are

$$y_{n+1} = y_n + \frac{1}{6}\left[k_0 + 2k_1 + 2k_2 + k_3\right], \tag{8.102}$$

where

$$\begin{aligned}
k_0 &= h f(x_n, y_n), \\
k_1 &= h f\left(x_n + \tfrac{1}{2}h,\ y_n + \tfrac{1}{2}k_0\right), \\
k_2 &= h f\left(x_n + \tfrac{1}{2}h,\ y_n + \tfrac{1}{2}k_1\right), \\
k_3 &= h f(x_n + h,\ y_n + k_2).
\end{aligned} \tag{8.103}$$

The basic idea of the Runge-Kutta method is to eliminate the error terms order
by order. A derivation of these equations appears in Ralston and Wilf$^{16}$ (see
Chapter 9 by M. J. Romanelli) and in Press et al.$^{17}$
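
Equations (8.102) and (8.103) translate almost line by line into code. The sketch below is a minimal fixed-step illustration with no error control; the function names `rk4_step` and `rk4` and the test problem $y' = -y^2$, $y(1) = 1$ (exact solution $1/x$) are our own choices for demonstration.

```python
def rk4_step(f, x, y, h):
    """One classic fourth-order Runge-Kutta step, Eqs. (8.102) and (8.103)."""
    k0 = h * f(x, y)
    k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
    k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(x + h, y + k2)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0

def rk4(f, x0, y0, h, n_steps):
    """Take n_steps fixed-size RK4 steps starting from (x0, y0); return y."""
    x, y = x0, y0
    for _ in range(n_steps):
        y = rk4_step(f, x, y, h)
        x += h
    return y

def f(x, y):
    """Right-hand side of the test ODE y' = -y^2 (exact solution y = 1/x)."""
    return -y * y

# Integrate from x = 1 to x = 2 and compare with the exact value 1/2.
for n in (10, 20, 40):
    y_end = rk4(f, 1.0, 1.0, 1.0 / n, n)
    print(f"h = {1.0 / n:.4f}  y(2) = {y_end:.10f}  error = {abs(y_end - 0.5):.2e}")
```

Halving $h$ should reduce the accumulated error by a factor of roughly 16 once $h$ is small enough, reflecting the fourth-order accuracy of the scheme.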

Equations (8.102) and (8.103) define what might be called the classic fourth-order
Runge-Kutta method (accurate through terms of order $h^4$). This is the
form followed in Sections 15.1 and 15.2 of Press et al. Many other Runge-Kutta
methods exist. Lapidus and Seinfeld (see Additional Reading) analyze and compare
other possibilities and recommend a fifth-order form due to Butcher as
slightly superior to the classic method. However, for applications not demanding
high precision and for not so smooth ODEs the fourth-order Runge-Kutta
method with adaptive step size control (see Press et al., Chapter 15) is
the method of choice for numerical solutions of ODEs. In general, but not always,
fourth-order Runge-Kutta is superior to second-order and higher order
Runge-Kutta schemes. From this Taylor expansion viewpoint the Runge-Kutta
method is also an example of analytic continuation.

For the special case in which $dy/dx$ is a function of $x$ alone [$f(x, y)$ in Eq.
(8.99) $\to f(x)$], the last term in Eq. (8.102) reduces to a Simpson rule numerical
integration from $x_n$ to $x_{n+1}$.
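
This reduction is easy to confirm numerically. In the sketch below (our own illustration; `rk4_step` and `simpson_panel` are hypothetical helper names, and $f(x) = \cos x$ with $x_n = 0.3$, $h = 0.2$ is an arbitrary test case) the Runge-Kutta increment for a right-hand side that ignores $y$ is compared with a single Simpson panel over the same interval.

```python
import math

def rk4_step(f, x, y, h):
    """One classic fourth-order Runge-Kutta step, Eqs. (8.102) and (8.103)."""
    k0 = h * f(x, y)
    k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
    k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(x + h, y + k2)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0

def simpson_panel(g, a, b):
    """Simpson's rule on the single panel [a, b]."""
    return (b - a) / 6.0 * (g(a) + 4.0 * g(0.5 * (a + b)) + g(b))

g = math.cos            # dy/dx depends on x alone
x_n, h = 0.3, 0.2

# Starting from y = 0, the RK4 result is just the increment y_{n+1} - y_n.
rk_increment = rk4_step(lambda x, y: g(x), x_n, 0.0, h)
simpson_value = simpson_panel(g, x_n, x_n + h)
exact = math.sin(x_n + h) - math.sin(x_n)

print("RK4 increment :", rk_increment)
print("Simpson panel :", simpson_value)
print("exact integral:", exact)
```

When $f$ does not depend on $y$, $k_1$ and $k_2$ coincide, so the weights $1, 2, 2, 1$ collapse to the Simpson weights $1, 4, 1$.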

The Runge-Kutta method is stable, meaning that small errors do not get
amplified. It is self-starting, meaning that we just take the $x_0$ and $y_0$ and away
we go. However, it has disadvantages. Four separate calculations of $f(x, y)$
are required at each step. The errors, although of order $h^5$ per step, are not
known. One checks the numerical solution by cutting $h$ in half and repeating
the calculation. If the second result agrees with the first, then $h$ was small
enough. Finally, the Runge-Kutta method can be extended to a set of coupled
first-order equations:

$$\frac{du}{dx} = f_1(x, u, v), \qquad \frac{dv}{dx} = f_2(x, u, v), \qquad \text{and so on}, \tag{8.104}$$

with as many dependent variables as desired. Again, Eq. (8.104) may be nonlinear,
an advantage of the numerical solution.


16 Ralston, A., and Wilf, H. S. (Eds.) (1960). Mathematical Methods for Digital Computers. Wiley,
New York.

17 Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1992). Numerical Recipes,
2nd ed. Cambridge Univ. Press, Cambridge, UK.

For high-precision applications one can also use either Richardson's extrapolation
in conjunction with the Bulirsch-Stoer method$^{18}$ or the predictor-corrector
method described later. Richardson's extrapolation is based on approximating
the numerical solution by a rational function that can then be
evaluated in the limit of step size $h \to 0$. This often allows for a large actual
step size in applications.



Predictor-Corrector Methods 

As an alternate attack on Eq. (8.99), we might estimate or predict a tentative
value of $y_{n+1}$ by

$$\bar{y}_{n+1} = y_{n-1} + 2h y_n' = y_{n-1} + 2h f(x_n, y_n). \tag{8.105}$$

This is not quite the same as Eq. (8.101). Rather, it may be interpreted as

$$\frac{\Delta y}{\Delta x} = \frac{\bar{y}_{n+1} - y_{n-1}}{2h} = y_n', \tag{8.106}$$

the derivative as a tangent being replaced by a chord. Next, we calculate

$$\bar{y}_{n+1}' = f(x_{n+1}, \bar{y}_{n+1}). \tag{8.107}$$

Then to correct for the crudeness of Eq. (8.105), we take

$$y_{n+1} = y_n + \frac{h}{2}\left(\bar{y}_{n+1}' + y_n'\right). \tag{8.108}$$

Here, the finite difference ratio $\Delta y/h$ is approximated by the average of the
two derivatives. This technique, a prediction followed by a correction (and
iteration until agreement is reached), is the heart of the predictor-corrector
method. It should be emphasized that the preceding set of equations is intended
only to illustrate the predictor-corrector method. The accuracy of this set (to
order $h^3$) is usually inadequate.
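
The illustrative scheme of Eqs. (8.105)-(8.108) can be sketched in a few lines. The code below is a demonstration only, under several assumptions of our own: the function name `predictor_corrector`, the convergence tolerance `tol`, the iteration cap `max_iter`, and above all the way the first step is produced. Because Eq. (8.105) needs both $y_{n-1}$ and $y_n$, the scheme cannot start itself; here the missing value $y_1$ is generated by one Euler prediction followed by one trapezoidal correction.

```python
def predictor_corrector(f, x0, y0, h, n_steps, tol=1e-12, max_iter=50):
    """Illustrative predictor-corrector scheme of Eqs. (8.105)-(8.108)."""
    # Starting step (an assumption of this sketch): Euler prediction for y_1,
    # followed by a single trapezoidal correction.
    dy0 = f(x0, y0)
    y1_pred = y0 + h * dy0
    y1 = y0 + 0.5 * h * (dy0 + f(x0 + h, y1_pred))
    xs, ys = [x0, x0 + h], [y0, y1]

    for n in range(1, n_steps):
        x_n, y_n = xs[n], ys[n]
        dy_n = f(x_n, y_n)
        x_next = x0 + (n + 1) * h
        y_next = ys[n - 1] + 2.0 * h * dy_n              # predictor, Eq. (8.105)
        for _ in range(max_iter):
            dy_next = f(x_next, y_next)                  # Eq. (8.107)
            y_new = y_n + 0.5 * h * (dy_next + dy_n)     # corrector, Eq. (8.108)
            converged = abs(y_new - y_next) < tol
            y_next = y_new
            if converged:
                break
        xs.append(x_next)
        ys.append(y_next)
    return xs, ys

def f(x, y):
    """Right-hand side of the test ODE y' = -y^2 (exact solution y = 1/x)."""
    return -y * y

xs, ys = predictor_corrector(f, 1.0, 1.0, 0.05, 20)
print(f"y({xs[-1]:.2f}) = {ys[-1]:.6f}   exact = {1.0 / xs[-1]:.6f}")
```

The inner loop is the iteration described above: the corrected value is fed back into Eq. (8.107) until it stops changing. In practical codes this loop is what makes the unmodified method expensive.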

The iteration [substituting $y_{n+1}$ from Eq. (8.108) back into Eq. (8.107) and
recycling until $y_{n+1}$ settles down to some limit] is time-consuming in a computer
run. Consequently, the iteration is usually replaced by an intermediate
step (the modifier) between Eqs. (8.105) and (8.107). This modified predictor-corrector
method has the major advantage over the Runge-Kutta method of
requiring only two computations of $f(x, y)$ per step instead of four. Unfortunately,
the method as originally developed was unstable: small errors (round-off
and truncation) tended to propagate and become amplified.


18 See Section 15.4 of Press et al. and also Stoer, J., and Bulirsch, R. (1980). Introduction to
Numerical Analysis (Chap. 7). Springer-Verlag, New York.

This very serious problem of instability has been overcome in a version of
the predictor-corrector method devised by Hamming. The formulas (which are
moderately involved), a partial derivation, and detailed instructions for starting
the solution are all given by Ralston (Chapter 8 of Ralston and Wilf). Hamming's
method is accurate to order $h^4$. It is stable for all reasonable values of $h$ and
provides an estimate of the error. Unlike the Runge-Kutta method, it is not
self-starting. For example, Eq. (8.105) requires both $y_{n-1}$ and $y_n$. Starting values
$(y_0, y_1, y_2, y_3)$ for the Hamming predictor-corrector method may be computed
by series solution (power series for small $x$ and asymptotic series for large $x$) or
by the Runge-Kutta method. The Hamming predictor-corrector method may
be extended to cover a set of coupled first-order ODEs, that is, Eq. (8.104).

Second-Order ODEs 

Any second-order differential equation,

$$y''(x) + P(x)y'(x) + Q(x)y(x) = F(x), \tag{8.109}$$

may be split into two first-order ODEs by writing

$$y'(x) = z(x) \tag{8.110}$$

and then

$$z'(x) + P(x)z(x) + Q(x)y(x) = F(x). \tag{8.111}$$

These coupled first-order ODEs may be solved by either the Runge-Kutta or
Hamming predictor-corrector techniques previously described. The
Runge-Kutta-Nystrom method for second-order ODEs is a more accurate version
that proceeds via an intermediate auxiliary $y'_{n+1}$. The form of Eqs. (8.102) and
(8.103) is assumed and the parameters are adjusted to fit a Taylor expansion
through $h^4$.
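
The splitting of Eqs. (8.110) and (8.111) can be combined with the classic Runge-Kutta step for coupled equations [Eq. (8.104)], as in the following minimal sketch. The function names `rk4_system` and `second_order_as_system` are our own, and the linear oscillator $y'' + y = 0$ with $y(0) = 0$, $y'(0) = 1$ (exact solution $\sin x$) is chosen only as a convenient test. This is the plain Runge-Kutta approach, not the Runge-Kutta-Nystrom variant.

```python
import math

def rk4_system(derivs, x, state, h):
    """One classic RK4 step for a list of coupled first-order ODEs, cf. Eq. (8.104)."""
    def shifted(s, k, factor):
        return [si + factor * ki for si, ki in zip(s, k)]
    k0 = [h * d for d in derivs(x, state)]
    k1 = [h * d for d in derivs(x + 0.5 * h, shifted(state, k0, 0.5))]
    k2 = [h * d for d in derivs(x + 0.5 * h, shifted(state, k1, 0.5))]
    k3 = [h * d for d in derivs(x + h, shifted(state, k2, 1.0))]
    return [s + (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k0, k1, k2, k3)]

def second_order_as_system(P, Q, F):
    """Return derivs(x, [y, z]) for y' = z, z' = F - P z - Q y, Eqs. (8.110)-(8.111)."""
    def derivs(x, state):
        y, z = state
        return [z, F(x) - P(x) * z - Q(x) * y]
    return derivs

# Linear oscillator y'' + y = 0 with y(0) = 0, y'(0) = 1, so that y = sin x.
derivs = second_order_as_system(lambda x: 0.0, lambda x: 1.0, lambda x: 0.0)
n = 100
h = 1.0 / n
state = [0.0, 1.0]          # [y, z] = [y, y']
for i in range(n):
    state = rk4_system(derivs, i * h, state, h)

print("y(1)  =", state[0], "  sin 1 =", math.sin(1.0))
print("y'(1) =", state[1], "  cos 1 =", math.cos(1.0))
```

Both $y$ and $y'$ are carried along, so the derivative is available at every grid point at no extra cost. The same `rk4_system` routine works unchanged for nonlinear or larger systems.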

As a final note, a thoughtless “turning the crank” application of these powerful
numerical techniques is an invitation to disaster. The solution of a new and
different differential equation will usually involve a combination of analysis 
and numerical calculation. There is little point in trying to force a Runge-Kutta 
solution through a singular point (see Section 8.4) where the solution (or y' or 
y") may blow up. For a more extensive treatment of computational methods 
we refer the reader to Garcia (see Additional Reading). 


EXERCISES 

8.7.1 The Runge-Kutta method, Eq. (8.102), is applied to a first-order ODE
$dy/dx = f(x)$. Note that this function $f(x)$ is independent of $y$. Show
that in this special case the Runge-Kutta method reduces to Simpson's
rule for numerical quadrature.

8.7.2 (a) A body falling through a resisting medium is described by

$$\frac{dv}{dt} = g - av$$

(for a retarding force proportional to the velocity). Take the constants
to be $g = 9.80$ (m/sec$^2$) and $a = 0.2$ (sec$^{-1}$). The initial
conditions are $t = 0$, $v = 0$. Integrate this equation out to $t = 20.0$
in steps of 0.1 sec. Tabulate the value of the velocity for each whole
second, $v(1.0)$, $v(2.0)$, and so on. If a plotting routine is available,
plot $v(t)$ versus $t$.

(b) Calculate the ratio of $v(20.0)$ to the terminal velocity $v(\infty)$.

Check value. $v(10) = 42.369$ m/sec.

ANS. (b) 0.9817.


8.7.3 Integrate Legendre's differential equation (Exercise 8.5.5) from $x = 0$
to $x = 1$ with the initial conditions $y(0) = 1$, $y'(0) = 0$ (even solution).
Tabulate $y(x)$ and $dy/dx$ at intervals of 0.05. Take $n = 2$.


8.7.4 The Lane-Emden equation of astrophysics is

$$\frac{d^2y}{dx^2} + \frac{2}{x}\frac{dy}{dx} + y^s = 0.$$

Take $y(0) = 1$, $y'(0) = 0$, and investigate the behavior of $y(x)$ for
$s = 0, 1, 2, 3, 4, 5$, and 6. In particular, locate the first zero of $y(x)$.
Hint. From a power series solution $y''(0) = -\frac{1}{3}$.

Note. For $s = 0$, $y(x)$ is a parabola; for $s = 1$, a spherical Bessel
function, $j_0(x)$. As $s \to 5$, the first zero moves out to $\infty$, and for $s >
5$, $y(x)$ never crosses the positive $x$-axis.

ANS. For $y(x_s) = 0$, $x_0 = 2.45\ (\approx\sqrt{6})$,
$x_1 = 3.14\ (\approx\pi)$, $x_2 = 4.35$,
$x_3 = 6.90$.

8.7.5 As a check on Exercise 8.6.10(a), integrate Hermite's equation ($\alpha = 0$)

$$\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} = 0$$

from $x = 0$ out to $x = 3$. The initial conditions are $y(0) = 0$, $y'(0) = 1$.
Tabulate $y(1)$, $y(2)$, and $y(3)$.

ANS. $y(1) = 1.463$, $y(2) = 16.45$, $y(3) = 1445$.


8.7.6 Solve numerically ODE 3 of Exercise 8.3.15 using the Euler method,
then compare with the Runge-Kutta method. For several fixed $x$, plot
on a log-log scale $|y(x) - y_{\rm num}(x)|$ versus step size, where $y(x)$ is your
analytic and $y_{\rm num}$ your numerical solution. Find the best step size.

8.7.7 Solve numerically Bessel's ODE in Eq. (8.62) for $n = 0, 1, 2$ and calculate
the location of the first two roots $J_n(\alpha_{ns}) = 0$.

Check value. $\alpha_{12} = 7.01559$.


8.7.8 Solve numerically the pendulum ODE
with a harmonically driven pivot. Choose your step size according to
the driving frequency $\omega$ and pick suitable parameters $l$, $a$, $\omega$. Include
and discuss the case $g \ll a$.

8.7.9 Solve numerically the ODE of Exercise 8.2.20. Compare with a Runge-Kutta
result.

8.7.10 Solve numerically the ODE of Exercise 8.2.21.

8.7.11 Solve numerically the ODE of Exercise 8.2.22.

8.8 Introduction to Partial Differential Equations 


The dynamics of many physical systems involve second-order derivatives, such
as the acceleration in classical mechanics and the kinetic energy, $\sim\nabla^2$, in
quantum mechanics, and lead to partial differential equations (PDEs) in time
and one or more spatial variables.

Partial derivatives are linear operators:

$$\frac{\partial\left(a\varphi(x, y) + b\psi(x, y)\right)}{\partial x} = a\frac{\partial\varphi(x, y)}{\partial x} + b\frac{\partial\psi(x, y)}{\partial x},$$

where $a$, $b$ are constants. Similarly, if $\mathcal{L}$ is an operator consisting of (partial)
derivatives, and the operator $\mathcal{L}$ is linear,

$$\mathcal{L}(a\psi_1 + b\psi_2) = a\mathcal{L}\psi_1 + b\mathcal{L}\psi_2,$$

and the PDE may be cast in the form

$$\mathcal{L}\psi = F,$$

where $F$ is the external source, independent of $\psi(t, \mathbf{r})$, the unknown function.
If $F = 0$ the PDE is called homogeneous; if $F \neq 0$ the PDE is inhomogeneous.
For homogeneous PDEs the superposition principle is valid: If
$\psi_1$, $\psi_2$ are solutions of the PDE, so is any linear combination $a\psi_1 + b\psi_2$. If $\mathcal{L}$
contains first-order partial derivatives at most, the PDE is called first order; if
$\mathcal{L}$ contains second-order derivatives, such as $\nabla^2$, but no higher derivatives, we
have a second-order PDE, etc. Second-order PDEs with constant coefficients
occur often in physics. They are classified further into elliptic PDEs if they
involve $\nabla^2$ or $\nabla^2 + c^{-2}\partial^2/\partial t^2$, parabolic with $a\,\partial/\partial t + \nabla^2$, and hyperbolic
with operators such as $c^{-2}\partial^2/\partial t^2 - \nabla^2$. Hyperbolic (and some parabolic) PDEs
have waves as solutions.


8.9 Separation of Variables 


Our first technique for solving PDEs splits the PDE of $n$ variables into $n$ ordinary
differential equations. Each separation introduces an arbitrary constant of
separation. If we have $n$ variables, we have to introduce $n - 1$ constants,
determined by the conditions imposed in the problem being solved. Let us
start with the heat flow equation for a rectangular metal slab. The geometry
suggests using Cartesian coordinates.$^{19}$


Cartesian Coordinates 

In a homogeneous medium at temperature $\psi(\mathbf{r})$ that varies with location, heat
flows from sites at high temperature to lower temperature in the direction
of negative temperature gradient. We assume that appropriate heat sources
are present on the boundaries to produce the boundary conditions. The heat
flow must be of the form $\mathbf{j} = -\kappa\nabla\psi$, where the proportionality constant $\kappa$
measures the heat conductivity of the medium, a rectangular metal slab in our
case. The current density is proportional to the velocity of the heat flow. If
the temperature increases somewhere, this is due to more heat flowing into
that particular volume element $d^3r$ than leaves it. From Section 1.6, we know
that the difference is given by the negative divergence of the heat flow; that is,
$-\nabla\cdot\mathbf{j}\,d^3r = \kappa\nabla^2\psi\,d^3r$. On the other hand, the increase of energy with time is
proportional to the change of temperature with time, the specific heat $\sigma$, and
the mass $\rho\,d^3r$, where $\rho$ is the density, taken to be constant in space and time.
In the absence of sources, we obtain the heat flow equation


$$\frac{\partial\psi}{\partial t} = \frac{\kappa}{\sigma\rho}\nabla^2\psi, \tag{8.112}$$


a parabolic PDE. For the simplest, time-independent steady-state case we
have $\partial\psi/\partial t = 0$, and the Laplace equation results. In Cartesian coordinates
the Laplace equation becomes


$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = 0 \tag{8.113}$$


using Eq. (2.5) for the Laplacian. Perhaps the simplest way of treating a PDE
such as Eq. (8.113) is to split it into a set of ODEs. This may be done with a
product ansatz, or trial form, for

$$\psi(x, y, z) = X(x)Y(y)Z(z), \tag{8.114}$$

which is then substituted back into Eq. (8.113). How do we know Eq. (8.114) is
valid? We are proceeding with trial and error. If our attempt succeeds, then Eq.
(8.114) will be justified. If it does not succeed, we shall find out soon enough
and then we try another attack. With $\psi$ assumed given by Eq. (8.114), Eq.
(8.113) becomes


$$YZ\frac{d^2X}{dx^2} + XZ\frac{d^2Y}{dy^2} + XY\frac{d^2Z}{dz^2} = 0. \tag{8.115}$$

Dividing by $\psi = XYZ$ and rearranging terms, we obtain

$$\frac{1}{X}\frac{d^2X}{dx^2} = -\frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{8.116}$$


19 Boundary conditions are part of the geometry, assumed Euclidean as well, and will be discussed
later and in Chapter 16 in more detail.

Equation (8.116) exhibits one separation of variables. The left-hand side is a
function of the variable $x$ alone, whereas the right-hand side depends only on $y$
and $z$. However, $x$, $y$, and $z$ are all independent coordinates. This independence
means that the recipe of Eq. (8.114) worked; the left-hand side of Eq. (8.116)
depends on $x$ only, etc. That is, the behavior of $x$ as an independent variable is
not determined by $y$ and $z$. Therefore, each side must be equal to a constant,
a constant of separation.

The choice of sign, completely arbitrary here, will be fixed in specific problems
by the need to satisfy specific boundary conditions, which we need to
discuss now. Let us put one corner of the slab in the coordinate origin with
its sides along the coordinate axes and its bottom sitting at $z = 0$ with a given
temperature distribution $\psi(x, y, z = 0) = \psi_0(x, y)$. To simplify the problem
further, we assume that the slab is finite in the $x$- and $y$-directions but infinitely
long in the $z$-direction with zero temperature as $z \to +\infty$. This is a reasonable
assumption, as long as we are not interested in the temperature near the
end of the slab ($z \to \infty$). We take the lengths of the slab in the transverse
$x$, $y$-directions to be the same, $2\pi$. Now we choose


$$\frac{1}{X}\frac{d^2X}{dx^2} = -l^2 \tag{8.117}$$


because the $\sin lx$, $\cos lx$ solutions with integer $l$ implied by the boundary
condition allow for a Fourier expansion to fit the $x$-dependence of the temperature
distribution $\psi_0$ at $z = 0$ and fixed $y$. [Note that if $a$ is the length of the slab in the
$x$-direction, we write the separation constant as $-(2\pi l/a)^2$ with integer $l$, and
this boundary condition gives the solutions $a_l\sin(2\pi lx/a) + b_l\cos(2\pi lx/a)$. If
the temperature at $x = 0$, $x = a$ and $z = 0$ is zero, then all $b_l = 0$ and the
cosine solutions are ruled out.]

Returning to the other half of Eq. (8.116), it becomes

$$-\frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2} = -l^2. \tag{8.118}$$


Rewriting it so as to separate the $y$- and $z$-dependent parts, we obtain

$$\frac{1}{Y}\frac{d^2Y}{dy^2} = l^2 - \frac{1}{Z}\frac{d^2Z}{dz^2}, \tag{8.119}$$

and a second separation has been achieved. Here we have a function of $y$
equated to a function of $z$ as before. We resolve the situation as before by
equating each side to another (negative) constant of separation, $-m^2$,


$$\frac{1}{Y}\frac{d^2Y}{dy^2} = -m^2, \tag{8.120}$$

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = l^2 + m^2 = n^2, \tag{8.121}$$

introducing a positive constant $n^2$ by $n^2 = l^2 + m^2$ to produce a symmetric
set of equations in $x$, $y$. As a consequence, the separation constant in the
$z$-direction is positive, which implies that its solution is exponential, $e^{\pm nz}$.
We discard solutions containing $e^{+nz}$ because they violate the boundary
condition for large values of $z$, where the temperature goes to zero. Now we have
three ODEs [Eqs. (8.117), (8.120), and (8.121)] to replace the PDE [Eq. (8.113)].
Our assumption [Eq. (8.114)] has succeeded and is thereby justified.
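
As a quick plausibility check of the separated solution, the sketch below (an illustration, not from the original text) builds one product solution with $X \sim \sin lx$, $Y \sim \cos my$, $Z \sim e^{-nz}$ and compares its finite-difference Laplacian with that of a deliberately wrong decay constant. The sample values $l = 2$, $m = 3$, the evaluation point, and the function names are assumptions chosen for the example.

```python
import math

def psi(x, y, z, l=2, m=3):
    """Product solution X_l(x) Y_m(y) Z_n(z) with n**2 = l**2 + m**2."""
    n = math.sqrt(l * l + m * m)
    return math.sin(l * x) * math.cos(m * y) * math.exp(-n * z)

def wrong_psi(x, y, z):
    """Same x, y factors, but a decay constant violating n**2 = l**2 + m**2."""
    return math.sin(2 * x) * math.cos(3 * y) * math.exp(-1.0 * z)

def laplacian_3d(f, x, y, z, h=1e-3):
    """Seven-point central-difference estimate of the Laplacian of f."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / h**2

point = (0.4, 0.9, 0.5)
good_residual = abs(laplacian_3d(psi, *point))
bad_residual = abs(laplacian_3d(wrong_psi, *point))
print(good_residual, bad_residual)
```

Only the choice $n^2 = l^2 + m^2$ makes the three separated contributions cancel; with any other decay constant the residual is of order one.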

Our solution should be labeled according to the choice of our constants
$l$, $m$, and $n$; that is,

$$\psi_{lm}(x, y, z) = X_l(x)Y_m(y)Z_n(z), \tag{8.122}$$

subject to the conditions of the problem being solved and to the condition
$n^2 = l^2 + m^2$. We may choose $l$ and $m$ as we like and Eq. (8.122) will still be a
solution of Eq. (8.113), provided $X_l(x)$ is a solution of Eq. (8.117), and so on.
We may develop the most general solution of Eq. (8.113) by taking a linear
combination of solutions by the superposition principle

$$\Psi(x, y, z) = \sum_{l,m} a_{lm}\psi_{lm}, \tag{8.123}$$

because the Laplace equation is homogeneous and linear. The constant
coefficients $a_{lm}$ are chosen to permit $\Psi$ to satisfy the boundary condition of the
problem at $z = 0$, where all $Z_n$ are normalized to $Z_n(0) = 1$, so that

$$\psi_0(x, y) = \sum_{l,m} a_{lm}X_l(x)Y_m(y). \tag{8.124}$$

In other words, the expansion coefficients $a_{lm}$ of our solution $\Psi$ in Eq. (8.123)
are uniquely determined as Fourier coefficients of the given temperature
distribution at the boundary $z = 0$ of the metal slab. A specific case is treated in
the next example.


EXAMPLE 8.9.1 


Cartesian Boundary Conditions Let us consider a case in which, at $z = 0$,
$\psi(x, y, 0) = \psi_0(x, y) = 100$°C (the boiling point of water) in the area $-1 <
x < 1$, $-1 < y < 1$, an input temperature distribution on the $z = 0$ plane.
Moreover, the temperature $\psi$ is held at zero, $\psi = 0$ (the freezing point of
water) at the end points $x = \pm 1$ for all $y$ and $z$, and at $y = \pm 1$ for all $x$ and $z$, a
boundary condition that restricts the temperature spread to the finite area of
the slab in the $x$, $y$-directions. Also, $\psi(x, y, z) \to 0$ for $z \to \infty$ for all $x$, $y$. The
entire slab, except the $z = 0$ plane, is in contact with a constant-temperature
heat bath, whose temperature can be taken as the zero of temperature because
Eq. (8.113) is invariant with respect to the addition of a constant to $\psi$.

Because of the adopted boundary conditions, we choose the solutions
$\cos(\pi lx/2)$ with integer $l$ that vanish at the end points $x = \pm 1$ and the
corresponding ones in the $y$-direction, excluding $l = 0$ and $X, Y = \text{const}$. Inside the
interval at $z = 0$, therefore, we have the (Fourier) expansion

$$X(x) = \sum_{l=1}^{\infty} a_l\cos\frac{\pi lx}{2} = 100, \qquad -1 < x < 1,$$

with coefficients (see Section 14.1)

$$a_l = \int_{-1}^{1} 100\cos\frac{\pi lx}{2}\,dx = \frac{400}{l\pi}\sin\frac{l\pi}{2} = \frac{400(-1)^{\mu}}{(2\mu+1)\pi}, \quad l = 2\mu + 1;$$

$$a_l = 0, \quad l = 2\mu$$

for integer $\mu$. The same Fourier expansion (now without the factor 100) applies
to the $y$-direction involving the integer summation index $\nu$ in $Y(y)$, whereas the
$z$-dependence is given by Eq. (8.121), so that the complete solution becomes

$$\psi(x, y, z) = XYZ = \frac{1600}{\pi^2}\sum_{\mu,\nu=0}^{\infty}\frac{(-1)^{\mu+\nu}}{(2\mu+1)(2\nu+1)}\cos\frac{(2\mu+1)\pi x}{2}\cos\frac{(2\nu+1)\pi y}{2}\,e^{-\pi nz/2},$$

$$n^2 = (2\mu + 1)^2 + (2\nu + 1)^2.$$

For $z > 0$ this solution converges absolutely, but at $z = 0$ there is only
conditional convergence for each sum that is caused by the discontinuity at
$z = 0$, $x = \pm 1$, $y = \pm 1$. ■
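
The convergence behavior described in this example can be explored numerically. The sketch below is an illustration under stated assumptions: it takes the expansion in $\cos[(2\mu+1)\pi x/2]$ with coefficients $400(-1)^\mu/[(2\mu+1)\pi]$ in $x$, the same coefficients without the factor 100 in $y$, and the decay factor $e^{-\pi nz/2}$ with $n^2 = (2\mu+1)^2 + (2\nu+1)^2$. The truncation orders and function names are arbitrary choices.

```python
import math

def psi(x, y, z, terms=60):
    """Truncated double series for the slab temperature at a point with z > 0."""
    total = 0.0
    for mu in range(terms):
        for nu in range(terms):
            p, q = 2 * mu + 1, 2 * nu + 1
            n = math.sqrt(p * p + q * q)
            total += ((-1) ** (mu + nu) / (p * q)
                      * math.cos(p * math.pi * x / 2)
                      * math.cos(q * math.pi * y / 2)
                      * math.exp(-math.pi * n * z / 2))
    return 1600.0 / math.pi**2 * total

def x_factor_at_z0(x, terms=20000):
    """Partial sum of the one-dimensional expansion of 100 on -1 < x < 1."""
    return sum(400.0 * (-1) ** mu / ((2 * mu + 1) * math.pi)
               * math.cos((2 * mu + 1) * math.pi * x / 2)
               for mu in range(terms))

center_boundary = x_factor_at_z0(0.0)  # approaches 100 only slowly at z = 0
center_inside = psi(0.0, 0.0, 0.5)     # rapidly convergent for z > 0
edge_inside = psi(1.0, 0.3, 0.5)       # on the wall x = 1, held at zero
print(center_boundary, center_inside, edge_inside)
```

At $z = 0$ many thousands of terms are needed to approach 100 at the center, whereas for $z > 0$ the exponential factor makes a modest truncation sufficient.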


Circular Cylindrical Coordinates 


Let us now consider a cylindrical, infinitely long metal rod with a heat source at
$z = 0$ that generates a given steady-state temperature distribution $\psi_0(\rho, \varphi) =
\psi(\rho, \varphi, 0)$ at $z = 0$ in a circular area about the origin and zero temperature
at large values of $z$ for all $\rho$ and $\varphi$. For the method of separation of variables
to apply, the radial boundary condition needs to be independent of $z$ and $\varphi$.
Choices that lead to reasonable physical situations are (i) zero temperature at
$\rho = a$ (radius of the rod, corresponding to immersion of the rod in a
constant-temperature bath) and (ii) zero gradient of temperature at $\rho = a$
(corresponding to no lateral flow out of the rod). Choice (ii) will lead to a situation in
which the lateral temperature distribution for large $z$ will approach a uniform
value equal to the average temperature of the area at $z = 0$. We want to find
the temperature distribution for $z > 0$ for all $\rho$ and $\varphi$ under such conditions.

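With our unknown function $\psi$ dependent on $\rho$, $\varphi$, and $z$, the Laplace
equation becomes (see Section 2.2 for $\nabla^2$)

$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2} = 0. \tag{8.125}$$

As before, we assume a factored form for $\psi$,

$$\psi(\rho, \varphi, z) = P(\rho)\Phi(\varphi)Z(z). \tag{8.126}$$

Substituting into Eq. (8.125), we have

$$\frac{\Phi Z}{\rho}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{PZ}{\rho^2}\frac{d^2\Phi}{d\varphi^2} + P\Phi\frac{d^2Z}{dz^2} = 0. \tag{8.127}$$

All the partial derivatives have become ordinary derivatives because each
function depends on a single variable. Dividing by the product $P\Phi Z$ and moving
the $z$ derivative to the right-hand side yields

$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} = -\frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{8.128}$$

Again, a function of $z$ on the right appears to depend on a function of $\rho$ and $\varphi$ on the
left. We resolve this paradox by setting each side of Eq. (8.128) equal to the
same constant. Let us choose $-n^2$. Then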


$$\frac{d^2Z}{dz^2} = n^2Z, \tag{8.129}$$

and

$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} = -n^2. \tag{8.130}$$


From Eq. (8.129) we find the already familiar exponential solutions $Z \sim e^{\pm nz}$,
from which we discard $e^{+nz}$ again because the temperature goes to zero at
large values of $z > 0$.

Returning to Eq. (8.130), multiplying by $\rho^2$ and rearranging terms, we
obtain$^{20}$

$$\frac{\rho}{P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + n^2\rho^2 = -\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2}. \tag{8.131}$$


We then set the right-hand side to the positive constant $m^2$ to obtain

$$\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2\Phi(\varphi). \tag{8.132}$$

Finally, as an illustration of how the constant $m$ in Eq. (8.132) is restricted, we
note that $\varphi$ in cylindrical and spherical polar coordinates is an azimuth angle. If
this is a classical problem, we shall certainly require that the azimuthal solution
$\Phi(\varphi)$ be single-valued; that is,

$$\Phi(\varphi + 2\pi) = \Phi(\varphi), \tag{8.133}$$

which yields the periodic solutions $\Phi(\varphi) \sim e^{\pm im\varphi}$ for integer $m$. This is
equivalent to requiring the azimuthal solution to have a period of $2\pi$ or an
integer fraction of it.$^{21}$ Therefore, $m$ must be an integer. Which integer it is depends on
the details of the boundary conditions of the problem. This is discussed later
and in Chapter 9. Whenever a coordinate corresponds to an azimuth angle the
separated equation always has the form of Eq. (8.132).
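
The single-valuedness requirement can be illustrated in a few lines. This is a minimal sketch with illustrative names and sample values, not part of the original text: it tests whether $e^{im\varphi}$ returns to itself after $\varphi \to \varphi + 2\pi$ for integer and noninteger $m$.

```python
import cmath
import math

def is_single_valued(m, phi=0.7, tol=1e-12):
    """Check whether exp(i*m*phi) returns to itself after phi -> phi + 2*pi."""
    before = cmath.exp(1j * m * phi)
    after = cmath.exp(1j * m * (phi + 2 * math.pi))
    return abs(after - before) < tol

for m in (0, 1, 2, -3, 0.5, 1.3):
    print(m, is_single_valued(m))
```

Only the integer values pass the test; for $m = 0.5$ the function changes sign after one full turn.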

Finally, for the $\rho$ dependence we have

$$\rho\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + (n^2\rho^2 - m^2)P = 0. \tag{8.134}$$


20 The choice of sign of the separation constant is arbitrary. However, a minus sign is chosen for
the axial coordinate $z$ in expectation of a possible exponential dependence on $z > 0$. A positive
sign is chosen for the azimuthal coordinate $\varphi$ in expectation of a periodic dependence on $\varphi$.

21 This also applies in most quantum mechanical problems but the argument is much more involved.
If $m$ is not an integer, or half an integer for spin $\frac{1}{2}$ particles, rotation group relations and ladder
operator relations (Section 4.3) are disrupted. Compare Merzbacher, E. (1962). Single valuedness
of wave functions. Am. J. Phys. 30, 237.











This is Bessel's ODE. The solutions and their properties are presented in
Chapter 12. We emphasize here that we can rescale the variable $\rho$ by a
constant in Eq. (8.134) so that $P$ must be a function of $n\rho$ and also depend on the
parameter $m$; hence the notation $P_m(n\rho)$. Because the temperature is finite at
the center of the rod, $\rho = 0$, $P_m$ must be the regular solution $J_m$ of Bessel's
ODE rather than the irregular solution, the Neumann function $Y_m$. In case of
boundary condition (i), $J_m(na) = 0$ will require $na$ to be a zero of the Bessel
function, thereby restricting $n$ to a discrete set of values. The alternative (ii)
requires $dJ_m(n\rho)/d\rho\,|_{\rho=a} = 0$ instead. To fit the solution to a distribution $\psi_0$ at
$z = 0$ one needs an expansion in Bessel functions and to use the associated
orthogonality relations.
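
To see how boundary condition (i) discretizes $n$, one can locate a zero of $J_0$ directly. The sketch below is illustrative only: it uses the standard power series for $J_m$ and plain bisection rather than any library routine, and the rod radius `a = 2.0` and the bracketing interval are assumed values.

```python
import math

def bessel_j(m, x, terms=60):
    """Power series for the regular Bessel function J_m(x), integer m >= 0."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + m))
               * (x / 2.0) ** (2 * k + m)
               for k in range(terms))

def first_zero(m, lo, hi, tol=1e-12):
    """Bisection for a root of J_m bracketed by [lo, hi]."""
    f_lo = bessel_j(m, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j(m, mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

a = 2.0                             # rod radius (illustrative value)
alpha_01 = first_zero(0, 2.0, 3.0)  # first positive zero of J_0
n_allowed = alpha_01 / a            # smallest n satisfying J_0(n a) = 0
print(alpha_01, n_allowed)
```

Each further zero of $J_m$ gives another allowed $n$, so the radial boundary condition plays the same role here that the integer $l$ played in the Cartesian case.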

The separation of variables of Laplace's equation in parabolic coordinates
also gives rise to Bessel's equation. The Bessel equation is notorious for the
variety of disguises it may assume. For an extensive tabulation of possible
forms the reader is referred to Tables of Functions by Jahnke and Emde.$^{22}$

The original Laplace equation, a three-dimensional PDE, has been replaced
by three ODEs [Eqs. (8.129), (8.132), and (8.134)]. A solution of the Laplace
equation is

$$\psi_{mn}(\rho, \varphi, z) = P_m(n\rho)\Phi_m(\varphi)Z_n(z). \tag{8.135}$$

Identifying the specific $P$, $\Phi$, $Z$ solutions by subscripts, we see that the most
general solution of the Laplace equation is a linear combination of the
product solutions:

$$\Psi(\rho, \varphi, z) = \sum_{m,n} a_{mn}P_m(n\rho)\Phi_m(\varphi)Z_n(z). \tag{8.136}$$

Here, the coefficients $a_{mn}$ are determined by the Bessel-Fourier expansion of
the boundary condition at $z = 0$, where the given temperature distribution $\psi_0$
has to obey

$$\psi_0(\rho, \varphi) = \sum_{m,n} a_{mn}P_m(n\rho)\Phi_m(\varphi) \tag{8.137}$$

because all $Z_n(z = 0) = 1$. Recall that $na$ is restricted to a discrete set of values
by the radial boundary condition, and $m$ are integers.


22 Jahnke, E., and Emde, F. (1945). Tables of Functions, 4th rev. ed. (p. 146). Dover, New York;
also, Jahnke, E., Emde, F., and Lösch, F. (1960). Tables of Higher Functions, 6th ed. McGraw-Hill,
New York.


Spherical Polar Coordinates


For a large metal sphere and spherical boundary conditions, with a
temperature distribution on the surface $\psi(r = a, \theta, \varphi) = \psi_0(\theta, \varphi)$ generated by a heat
source at the surface $r = a$, let us separate the Laplace equation in spherical
polar coordinates. Using Eq. (2.77), we obtain

$$\frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = 0. \tag{8.138}$$

Now, in analogy with Eq. (8.114) we try a product solution

$$\psi(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi). \tag{8.139}$$

By substituting back into Eq. (8.138) and dividing by $R\Theta\Phi$, we have

$$\frac{1}{Rr^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{\Theta r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi r^2\sin^2\theta}\frac{d^2\Phi}{d\varphi^2} = 0. \tag{8.140}$$

Note that all derivatives are now ordinary derivatives rather than partials. By
multiplying by $r^2\sin^2\theta$, we can isolate $(1/\Phi)(d^2\Phi/d\varphi^2)$ to obtain$^{23}$

$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = r^2\sin^2\theta\left[-\frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)\right]. \tag{8.141}$$
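Equation (8.141) relates a function of $\varphi$ alone to a function of $r$ and $\theta$ alone.
Since $r$, $\theta$, and $\varphi$ are independent variables, we equate each side of Eq. (8.141)
to a constant. In almost all physical problems $\varphi$ will appear as an azimuth
angle. This suggests a periodic solution rather than an exponential so that $\Phi$
is single-valued. With this in mind, let us use $-m^2$ as the separation constant.
Then

$$\frac{1}{\Phi}\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2 \tag{8.142}$$

and

$$\frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{r^2\sin^2\theta} = 0. \tag{8.143}$$

Multiplying Eq. (8.143) by $r^2$ and rearranging terms, we obtain

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = -\frac{1}{\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}. \tag{8.144}$$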

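Again, the variables are separated. We equate each side to a constant $Q$ and
finally obtain

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\Theta + Q\Theta = 0, \tag{8.145}$$

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{QR}{r^2} = 0. \tag{8.146}$$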


23 The order in which the variables are separated here is not unique. Many quantum mechanics
texts show the $r$ dependence split off first.

Once more we have replaced a PDE of three variables by three ODEs. The
solutions of these ODEs are discussed in Chapters 11 and 12. In Chapter 11,
for example, Eq. (8.145) is identified as the associated Legendre equation
in which the constant $Q$ becomes $l(l + 1)$; $l$ is a non-negative integer. The radial
Eq. (8.146) has powers $R \sim r^l, r^{-l-1}$ for solutions so that $Q = l(l + 1)$ is
maintained. These power solutions occur in the multipole expansions of
electrostatic and gravitational potentials, the most important physical applications.
The corresponding positive power solutions are called harmonic polynomials,
but the negative powers are required for a complete solution. The boundary
conditions usually determine whether or not the negative powers are retained
as (irregular) solutions.
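
The two radial powers quoted above follow from a one-line substitution into Eq. (8.146). The short derivation below is a sketch; the trial exponent $s$ is notation introduced here for illustration only.

```latex
\begin{aligned}
R(r) = r^{s}:\quad
\frac{1}{r^{2}}\frac{d}{dr}\!\left(r^{2}\frac{dR}{dr}\right)
  &= \frac{1}{r^{2}}\frac{d}{dr}\!\left(s\,r^{s+1}\right)
   = s(s+1)\,r^{s-2}, \\
\frac{QR}{r^{2}} &= Q\,r^{s-2}, \\
s(s+1) = Q = l(l+1)
  \quad&\Longrightarrow\quad s = l \ \text{ or } \ s = -l-1 .
\end{aligned}
```

Both roots of the quadratic in $s$ give the same $Q$, which is why the regular and irregular powers always appear as a pair.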

Again, our most general solution may be written as

$$\Psi(r, \theta, \varphi) = \sum_{l,m} a_{lm}R_l(r)\Theta_{lm}(\theta)\Phi_m(\varphi). \tag{8.147}$$


The great importance of this separation of variables in spherical polar
coordinates stems from the fact that the method covers a tremendous amount
of physics: many of the theories of gravitation, electrostatics, atomic, nuclear,
and particle physics, where the angular dependence is isolated in the same
Eqs. (8.142) and (8.145), which can be solved exactly. In the hydrogen atom
problem, one of the most important examples of the Schrödinger wave
equation with a closed form solution, the analog of Eq. (8.146) for the hydrogen
atom becomes the associated Laguerre equation.

Whenever a coordinate $z$ corresponds to an axis of translation the separated
equation always has the form

$$\frac{d^2Z(z)}{dz^2} = \pm a^2Z(z)$$

in one of the cylindrical coordinate systems. The solutions, of course, are $\sin az$
and $\cos az$ for $-a^2$ and the corresponding hyperbolic functions (or
exponentials) $\sinh az$ and $\cosh az$ for $+a^2$.

Other occasionally encountered ODEs include the Laguerre and associated
Laguerre equations from the supremely important hydrogen atom problem in
quantum mechanics:

$$x\frac{d^2y}{dx^2} + (1 - x)\frac{dy}{dx} + \alpha y = 0, \tag{8.148}$$

$$x\frac{d^2y}{dx^2} + (1 + k - x)\frac{dy}{dx} + \alpha y = 0. \tag{8.149}$$

From the quantum mechanical theory of the linear oscillator we have Hermite's
equation,

$$\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + 2\alpha y = 0. \tag{8.150}$$

Finally, occasionally we find the Chebyshev differential equation

$$(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + n^2y = 0. \tag{8.151}$$

More ODEs and two generalizations of them will be examined and
systematized in Chapter 16. General properties following from the form of the
differential equations are discussed in Chapter 9. The individual solutions are
developed and applied in Chapters 11-13.

The practicing physicist probably will encounter other second-order ODEs,
some of which may possibly be transformed into the examples studied here.
Some of these ODEs may be solved by the techniques of Sections 8.5 and 8.6.
Others may require a computer for a numerical solution.
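
For integer values of the parameter, the Hermite and Chebyshev equations quoted above have polynomial solutions, and this can be confirmed with exact integer arithmetic. The sketch below is an illustration, not the book's method: it assumes the standard recurrences $H_{k+1} = 2xH_k - 2kH_{k-1}$ and $T_{k+1} = 2xT_k - T_{k-1}$, takes $\alpha = n$ in the Hermite equation, and stores polynomials as coefficient lists (lowest power first). All helper names are hypothetical.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest power first)."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(p, c):
    return [c * a for a in p]

def poly_mul_x(p):
    """Multiply a polynomial by x."""
    return [0] + p

def poly_deriv(p):
    return [k * a for k, a in enumerate(p)][1:] or [0]

def hermite(n):
    """H_n from the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}."""
    prev, cur = [1], [0, 2]
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, poly_add(poly_scale(poly_mul_x(cur), 2),
                                  poly_scale(prev, -2 * k))
    return cur

def chebyshev(n):
    """T_n from the recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(1, n):
        prev, cur = cur, poly_add(poly_scale(poly_mul_x(cur), 2),
                                  poly_scale(prev, -1))
    return cur

def hermite_residual(n):
    """Coefficients of y'' - 2x y' + 2n y for y = H_n."""
    y = hermite(n)
    d1, d2 = poly_deriv(y), poly_deriv(poly_deriv(y))
    return poly_add(poly_add(d2, poly_scale(poly_mul_x(d1), -2)),
                    poly_scale(y, 2 * n))

def chebyshev_residual(n):
    """Coefficients of (1 - x^2) y'' - x y' + n^2 y for y = T_n."""
    y = chebyshev(n)
    d1, d2 = poly_deriv(y), poly_deriv(poly_deriv(y))
    x2_d2 = poly_mul_x(poly_mul_x(d2))
    return poly_add(poly_add(poly_add(d2, poly_scale(x2_d2, -1)),
                             poly_scale(poly_mul_x(d1), -1)),
                    poly_scale(y, n * n))

print(hermite(3), hermite_residual(3))
print(chebyshev(4), chebyshev_residual(4))
```

Because the arithmetic is exact, every residual coefficient vanishes identically rather than to within rounding error.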

• To put the separation method of solving PDEs in perspective, let us view it as
a consequence of a symmetry of the PDE. Take the stationary Schrödinger
equation $H\psi = E\psi$ as an example with a potential $V(r)$ depending only
on the radial distance $r$. Then this PDE is invariant under rotations that
comprise the group SO(3). Its diagonal generator is the orbital angular
momentum operator $L_z = -i\,\partial/\partial\varphi$, and its quadratic (Casimir) invariant is $\mathbf{L}^2$.
Since both commute with $H$ (see Section 4.3), we end up with three
separate eigenvalue equations:

$$H\psi = E\psi, \qquad \mathbf{L}^2\psi = l(l + 1)\psi, \qquad L_z\psi = m\psi.$$

Upon replacing $L_z^2$ in $\mathbf{L}^2$ by its eigenvalue $m^2$, the $\mathbf{L}^2$ PDE becomes
Legendre's ODE (see Exercise 2.5.12), and similarly $H\psi = E\psi$ becomes the
radial ODE of the separation method in spherical polar coordinates upon
substituting the eigenvalue $l(l + 1)$ for $\mathbf{L}^2$.

• For cylindrical coordinates the PDE is invariant under rotations about the
$z$-axis only, which form a subgroup of SO(3). This invariance yields the
generator $L_z = -i\,\partial/\partial\varphi$ and separate azimuthal ODE $L_z\psi = m\psi$ as before.
Invariance under translations along the $z$-axis with the generator $-i\,\partial/\partial z$
gives the separate ODE in the $z$-variable provided the boundary conditions
obey the same symmetry. The potential $V = V(\rho)$ or $V = V(z)$ depends on
one variable, as a rule.

• In general, there are $n$ mutually commuting generators $H_i$ with eigenvalues
$m_i$ of the (classical) Lie group $G$ of rank $n$ and the corresponding Casimir
invariants $C_i$ with eigenvalues $c_i$, which yield the separate ODEs

$$H_i\psi = m_i\psi, \qquad C_i\psi = c_i\psi$$

in addition to the radial ODE $H\psi = E\psi$.

EXERCISES 

8.9.1 An atomic (quantum mechanical) particle is confined inside a rectangular
box of sides $a$, $b$, and $c$. The particle is described by a wave function
$\psi$ that satisfies the Schrödinger wave equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi.$$

The wave function is required to vanish at each surface of the box (but
not to be identically zero). This condition imposes constraints on the
separation constants and therefore on the energy $E$. What is the smallest
value of $E$ for which such a solution can be obtained?

ANS. $E = \dfrac{\pi^2\hbar^2}{2m}\left(\dfrac{1}{a^2} + \dfrac{1}{b^2} + \dfrac{1}{c^2}\right)$.

8.9.2 The quantum mechanical angular momentum operator is given by $\mathbf{L} =
-i(\mathbf{r}\times\nabla)$. Show that

$$\mathbf{L}\cdot\mathbf{L}\,\psi = l(l + 1)\psi$$

leads to the associated Legendre equation.

Hint. Exercises 1.8.6 and 2.5.13 may be helpful.

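8.9.3 The one-dimensional Schrödinger wave equation for a particle in a
potential field $V = \frac{1}{2}kx^2$ is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}kx^2\psi = E\psi(x).$$

(a) Using $\xi = ax$ and a constant $\lambda$, we have

$$a = \left(\frac{mk}{\hbar^2}\right)^{1/4}, \qquad \lambda = \frac{2E}{\hbar}\left(\frac{m}{k}\right)^{1/2}.$$

Show that

$$\frac{d^2\psi(\xi)}{d\xi^2} + (\lambda - \xi^2)\psi(\xi) = 0.$$

(b) Substituting

$$\psi(\xi) = y(\xi)e^{-\xi^2/2},$$

show that $y(\xi)$ satisfies the Hermite differential equation.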

8.9.4 Verify that the following are solutions of Laplace's equation:

(a) $\psi_1 = 1/r$, (b) $\psi_2 = \dfrac{1}{2r}\ln\dfrac{r + z}{r - z}$.

Note. The $z$ derivatives of $1/r$ generate the Legendre polynomials,
$P_n(\cos\theta)$ (Exercise 11.1.7). The $z$ derivatives of $(1/2r)\ln[(r + z)/(r - z)]$
generate the Legendre functions, $Q_n(\cos\theta)$.

8.9.5 If $\Psi$ is a solution of Laplace's equation, $\nabla^2\Psi = 0$, show that $\partial\Psi/\partial z$ is
also a solution.

Chapter 9




Sturm-Liouville 
Theory—Orthogonal 
Functions 


In the preceding chapter, we developed two linearly independent solutions
of the second-order linear homogeneous differential equation and proved
that no third linearly independent solution existed. In this chapter, the
emphasis shifts from solving the differential equation to developing and
understanding general properties of the set of solutions. There is a close analogy
between the concepts in this chapter and those of linear algebra in
Chapter 3. Functions here play the role of vectors there, and linear operators
play that of matrices in Chapter 3. The diagonalization of a real symmetric
matrix in Chapter 3 corresponds here to the solution of an ordinary
differential equation (ODE), defined by a self-adjoint operator $\mathcal{L}$, in terms of
its eigenfunctions, which are the "continuous" analog of the eigenvectors
in Chapter 3, and real eigenvalues that correspond to physically observable
quantities in the laboratory. Just as a column eigenvector $\mathbf{a}$ is
written as $|a\rangle$ in the Dirac notation of Chapter 3, we now write an
eigenfunction as $|\varphi\rangle$. In the Cartesian component $a_i = \hat{\mathbf{x}}_i\cdot\mathbf{a}$, the discrete index $i$ of
the coordinate unit vectors is now replaced by the continuous variable $x$ in
$\varphi(x)$.

In Section 9.1, the concepts of self-adjoint operator, eigenfunction, eigen¬ 
value, and Hermitian operator are presented. The concept of adjoint operator, 
given first in terms of differential equations, is then redefined in accordance 
with usage in quantum mechanics. The vital properties of reality of eigenvalues 
and orthogonality of eigenfunctions are derived in Section 9.2. In Section 9.3, 
we discuss the Gram-Schmidt procedure for systematically constructing sets 
of orthogonal functions. Finally, the general property of the completeness of 
a set of eigenfunctions is explored in Section 9.4.


9.1 Self-Adjoint ODEs


In Chapter 8, we studied, classified, and solved linear, second-order ODEs
corresponding to linear, second-order differential operators of the general form

$$\mathcal{L} = p_0(x)\frac{d^2}{dx^2} + p_1(x)\frac{d}{dx} + p_2(x), \tag{9.1}$$

defined over the region $a \le x \le b$. A number of restrictions apply. The
coefficients $p_0(x)$, $p_1(x)$, and $p_2(x)$ are real functions of $x$, and the first $2 - i$
derivatives of $p_i(x)$ are continuous. Reference to Eqs. (8.42) and (8.44) shows
that $P(x) = p_1(x)/p_0(x)$ and $Q(x) = p_2(x)/p_0(x)$. Hence, $p_0(x)$ must not
vanish for $a < x < b$. The zeros of $p_0(x)$ are singular points (Section 8.4), and
the preceding statement means that our interval $[a, b]$ must be such that there
are no singular points in the interior of the interval. There may be and often
are singular points on the boundaries. Moreover, $b \to \infty$ and/or $a \to -\infty$ are
possible in certain problems.

For a linear operator $\mathcal{L}$, the analog of a quadratic form for a matrix in
Chapter 3 is the integral in Dirac's notation of Section 3.2:

$$\langle u|\mathcal{L}|u\rangle \equiv \langle u|\mathcal{L}u\rangle \equiv \int_a^b u^*(x)\mathcal{L}u(x)\,dx = \int_a^b u(p_0u'' + p_1u' + p_2u)\,dx, \tag{9.2}$$

taking $u = u^*$ to be real. This integral is the continuous analog of the inner
product of vectors in Chapter 1 and here of $u$ and $\mathcal{L}u$. Two vectors $u(x)$, $v(x)$
are orthogonal if their inner product vanishes,

$$\langle v|u\rangle \equiv \int_a^b v^*(x)u(x)\,dx = 0.$$

If we shift the derivatives to the first factor $u$ in Eq. (9.2) by integrating by parts
once or twice, we are led to the equivalent expression

$$\begin{aligned}
\langle u|\mathcal{L}u\rangle &= \left[u(x)(p_1 - p_0')u(x)\right]_{x=a}^{b} + \int_a^b\left[\frac{d^2}{dx^2}(p_0u) - \frac{d}{dx}(p_1u) + p_2u\right]u\,dx \\
&= \left[u(x)(p_1 - p_0')u(x)\right]_{x=a}^{b} + \left\langle\frac{d^2}{dx^2}(p_0u) - \frac{d}{dx}(p_1u) + p_2u\,\Big|\,u\right\rangle.
\end{aligned} \tag{9.3}$$

For Eqs. (9.2) and (9.3) to agree for all $u$, the integrands have to be equal. The
comparison yields

$$u(p_0'' - p_1')u + 2u(p_0' - p_1)u' = 0$$

or

$$p_0'(x) = p_1(x). \tag{9.4}$$

The terms at the boundary $x = a$ and $x = b$ in Eq. (9.3) then also vanish.

Because of the analogy with the transpose matrix in Chapter 3, it is
convenient to define the linear operator in Eq. (9.3)

$$\bar{\mathcal{L}}u = \frac{d^2}{dx^2}(p_0u) - \frac{d}{dx}(p_1u) + p_2u = p_0\frac{d^2u}{dx^2} + (2p_0' - p_1)\frac{du}{dx} + (p_0'' - p_1' + p_2)u \tag{9.5}$$

as the adjoint$^{1}$ operator $\bar{\mathcal{L}}$ so that $\langle u|\mathcal{L}u\rangle = \langle\bar{\mathcal{L}}u|u\rangle$ for all $u$. The necessary
and sufficient condition that $\bar{\mathcal{L}} = \mathcal{L}$, or $\langle u|\mathcal{L}u\rangle = \langle\mathcal{L}u|u\rangle$, is that Eq. (9.4) be
satisfied for all $u$. When this condition is satisfied,

$$\bar{\mathcal{L}}u = \mathcal{L}u = \frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] + q(x)u(x), \tag{9.6}$$

where $p(x) = p_0(x)$ and $q(x) = p_2(x)$, and the operator $\mathcal{L}$ is said to be
self-adjoint. The importance of the form of Eq. (9.6) is that we will be able to carry
out two integrations by parts in Eq. (9.3).$^{2}$
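
The role of condition (9.4) can be seen in an exact calculation with polynomials. The sketch below is illustrative and makes several assumptions not in the original text: it tests the two-function symmetry $\langle v|\mathcal{L}u\rangle = \langle u|\mathcal{L}v\rangle$ on $[-1, 1]$ with unit weight, uses a Legendre-type operator ($p_0 = 1 - x^2$, $p_1 = -2x$) and a Chebyshev-type operator ($p_0 = 1 - x^2$, $p_1 = -x$), and picks two arbitrary test polynomials. All helper names are hypothetical.

```python
from fractions import Fraction

def deriv(p):
    """Derivative of a polynomial given as coefficients, lowest power first."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    size = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (size - len(p))
    q = list(q) + [Fraction(0)] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

def integrate(p, a, b):
    """Exact integral of the polynomial p over [a, b]."""
    return sum(c * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))

def apply_L(p0, p1, p2, u):
    """L u = p0 u'' + p1 u' + p2 u for polynomial coefficients p0, p1, p2."""
    return add(add(mul(p0, deriv(deriv(u))), mul(p1, deriv(u))), mul(p2, u))

u = [Fraction(c) for c in (1, 2, 0, 1)]   # u(x) = 1 + 2x + x^3
v = [Fraction(c) for c in (0, 1, 3)]      # v(x) = x + 3x^2

# Legendre-type operator: p0 = 1 - x^2, p1 = -2x = p0', p2 = 0 (Eq. 9.4 holds).
legendre = ([1, 0, -1], [0, -2], [0])
# Same p0 but p1 = -x, so p0' != p1 (Chebyshev-type, not self-adjoint as written).
chebyshev = ([1, 0, -1], [0, -1], [0])

def asymmetry(op):
    """<v|Lu> - <u|Lv> on [-1, 1] with unit weight."""
    p0, p1, p2 = ([Fraction(c) for c in p] for p in op)
    return (integrate(mul(v, apply_L(p0, p1, p2, u)), -1, 1)
            - integrate(mul(u, apply_L(p0, p1, p2, v)), -1, 1))

print(asymmetry(legendre), asymmetry(chebyshev))
```

For the Legendre-type operator the boundary terms vanish because $p_0(\pm 1) = 0$, so the asymmetry is exactly zero; for the Chebyshev-type operator it is not, which is why a weight function is needed there.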

1 The adjoint operator bears a somewhat forced relationship to the transpose matrix. A better
justification for the nomenclature is found in a comparison of the self-adjoint operator (plus
appropriate boundary conditions) with the symmetric matrix. These significant properties are
developed in Section 9.2. Because of these properties, we are interested in self-adjoint operators.
When adjoint or self-adjoint operators are discussed in the context of a Hilbert space, all functions
of that space will satisfy the boundary conditions.

2 The importance of the self-adjoint form (plus boundary conditions) will become apparent in
Section 9.2, Eq. (9.22) and after.

3 If we multiply $\mathcal{L}$ by $f(x)/p_0(x)$ and then demand that

$$f'(x) = \frac{f\,p_1}{p_0},$$

so that the new operator will be self-adjoint, we obtain

$$f(x) = \exp\left[\int^x\frac{p_1(t)}{p_0(t)}\,dt\right].$$

The ODEs introduced in Section 8.3, Legendre's equation, and the linear
oscillator equation are self-adjoint, but others, such as the Laguerre and
Hermite equations, are not. However, the theory of linear, second-order,
self-adjoint differential equations is perfectly general because we can always
transform the non-self-adjoint operator into the required self-adjoint form. Consider
Eq. (9.1) with $p_0' \neq p_1$. If we multiply $\mathcal{L}$ by$^{3}$

$$\frac{1}{p_0(x)}\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\,dt\right],$$

we obtain

$$\frac{1}{p_0(x)}\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\,dt\right]\mathcal{L}u(x) = \frac{d}{dx}\left\{\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\,dt\right]\frac{du(x)}{dx}\right\} + \frac{p_2(x)}{p_0(x)}\exp\left[\int^x\frac{p_1(t)}{p_0(t)}\,dt\right]u, \tag{9.7}$$

which is clearly self-adjoint [see Eq. (9.6)]. Notice the $p_0(x)$ in the denominator.
This is why we require $p_0(x) \neq 0$, $a < x < b$. In the following development,
we assume that $\mathcal{L}$ has been put into self-adjoint form.
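
As a worked instance of this transformation, consider Hermite's equation, $y'' - 2xy' + 2\alpha y = 0$. The following is a short sketch of the steps, with the integration constant in the exponent dropped because it only rescales the equation.

```latex
\begin{aligned}
&\mathcal{L}y = \frac{d^{2}y}{dx^{2}} - 2x\frac{dy}{dx},
 \qquad p_0 = 1,\quad p_1 = -2x,\quad p_2 = 0,
 \qquad p_0' = 0 \neq p_1, \\
&\frac{1}{p_0(x)}\exp\!\left[\int^{x}\frac{p_1(t)}{p_0(t)}\,dt\right]
   = \exp\!\left[\int^{x}(-2t)\,dt\right] = e^{-x^{2}}, \\
&e^{-x^{2}}\left(\frac{d^{2}y}{dx^{2}} - 2x\frac{dy}{dx} + 2\alpha y\right)
   = \frac{d}{dx}\!\left(e^{-x^{2}}\frac{dy}{dx}\right) + 2\alpha\,e^{-x^{2}}\,y = 0 .
\end{aligned}
```

The multiplier $e^{-x^2}$ thus appears both as $p(x)$ and as the factor multiplying the eigenvalue term, which is the role of a weight function.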


Eigenfunctions and Eigenvalues 


Schrödinger's time-independent wave equation for a single-particle system,

$$H\psi(x) = E\psi(x),$$

is a major example of an eigenvalue equation in physics; here, the differential
operator $\mathcal{L}$ is defined by the Hamiltonian $H$ and the eigenvalue is the total
energy $E$ of the system. The eigenfunction $\psi(x)$ is usually called a wave
function. A variational derivation of this Schrödinger equation appears in Section
18.5. Based on spherical, cylindrical, or some other symmetry properties, a
three- or four-dimensional partial differential equation (PDE) or eigenvalue
equation, such as the Schrödinger equation, often separates into three (or
more) eigenvalue equations in a single variable. In this context, an eigenvalue
equation sometimes takes the more general self-adjoint form,

$$\mathcal{L}u(x) + \lambda w(x)u(x) = 0, \quad\text{or}\quad \mathcal{L}|u\rangle + \lambda w|u\rangle = 0, \tag{9.8}$$

where the constant $\lambda$ is the eigenvalue, $\mathcal{L}$ is self-adjoint, and $w(x)$ is a known
weight or density function; $w(x) > 0$ except possibly at isolated points at
which $w(x) = 0$. The Schrödinger equation for the simple harmonic oscillator
is a particular case of Eq. (9.8) with $w = 1$. The analysis of Eq. (9.8) with $\mathcal{L}$ as
defined in Eq. (9.6) and its solutions is called Sturm-Liouville theory. For a
given choice of the parameter $\lambda$, a function $u_\lambda(x)$, which satisfies Eq. (9.8) and
the imposed boundary conditions discussed later, is called an eigenfunction
corresponding to $\lambda$. The constant $\lambda$ is then called an eigenvalue. There is
no guarantee that an eigenfunction $u_\lambda(x)$ will exist for an arbitrary choice of
the parameter $\lambda$. Indeed, the requirement that there be an eigenfunction often
restricts the acceptable values of $\lambda$ to a discrete set.

The inner product (or overlap integral) of two functions

$$\langle v|u\rangle \equiv \int v^*(x)u(x)w(x)\,dx$$

depends on the weight function and generalizes our previous definition for
$w = 1$ to $w \neq 1$. The latter also modifies the definition of orthogonality
of two eigenfunctions: They are orthogonal if their overlap integral vanishes,
$\langle v|u\rangle = 0$. Examples of eigenvalues for the Legendre and Hermite
equations appear in the exercises of Section 8.5. Here, we have the mathematical
approach to the process of quantization in quantum mechanics.

Table 9.1

Equation                       p(x)              q(x)             λ           w(x)
Legendre                       1 - x^2           0                l(l + 1)    1
Shifted Legendre               x(1 - x)          0                l(l + 1)    1
Associated Legendre            1 - x^2           -m^2/(1 - x^2)   l(l + 1)    1
Bessel (a)                     x                 -n^2/x           a^2         x
Laguerre                       x e^{-x}          0                α           e^{-x}
Associated Laguerre            x^{k+1} e^{-x}    0                α - k       x^k e^{-x}
Hermite                        e^{-x^2}          0                2α          e^{-x^2}
Simple harmonic oscillator (b) 1                 0                n^2         1

(a) Orthogonality of Bessel functions is special. Compare Section 12.1 for details.
(b) This will form the basis for Chapter 14, Fourier series.

The extra weight function $w(x)$ sometimes appears as an asymptotic wave
function $\psi_\infty$ that is a common factor in all solutions of a PDE, such as the
Schrödinger equation, for example, when the potential $V(x) \to 0$ as $x \to \infty$
in $H = T + V$. We find $\psi_\infty$ when we set $V = 0$ in the Schrödinger equation.
Another source for $w(x)$ may be a nonzero angular momentum barrier of the
form $l(l + 1)/x^2$ in a PDE or separated ODE that has a regular singularity and
dominates at $x \to 0$. In such a case the indicial equation, such as Eqs. (8.65) or
(8.88), shows that the wave function has $x^l$ as an overall factor. Since the wave
function enters twice in matrix elements $\langle v|Hu\rangle$ and orthogonality relations
$\langle v|u\rangle$, the weight functions in Table 9.1 come from these common factors in
both radial wave functions. This is the physical reason why the $\exp(-x)$ for
Laguerre polynomials and $x^k\exp(-x)$ for associated Laguerre polynomials
in Table 9.1 arise and is explained in more detail in the next example. The
mathematical reason is that the weight function is needed to make the ODE
self-adjoint.


EXAMPLE 9.1.1 


Asymptotic Behavior, Weight Function Let us look at the asymptotic
forms for small and large $r$ of the radial Schrödinger equation for a particle of
mass $m$ moving in a spherically symmetric potential $V(r)$,

\[ \left( -\frac{\hbar^2}{2m} \frac{d^2}{dr^2} + \frac{\hbar^2}{2m} \frac{l(l+1)}{r^2} + V(r) - E \right) u_l(r) = 0, \]

where $R_l(r) = u_l(r)/r$ is the radial wave function, $E$ is the energy eigenvalue,
and $l$ is the orbital angular momentum (see Exercises 2.5.12 and 2.5.14). From
the asymptotic ODEs we derive the asymptotic forms of the radial wave
function. The boundary conditions are that $u_l(0) = 0$ and $u_l(r) \to 0$ for large $r$.
Let us explain these boundary conditions for $l = 0$. We ask that $u_0 \to 0$ as
$r \to 0$ because if $u_0(0)$ diverged the wave function could not be normalized;
that is, $\langle u_0 | u_0 \rangle = 1$ cannot be enforced. If $u_0(0)$ is a finite nonzero constant,
then $R_0(r) \sim 1/r$ for $r \to 0$. In that case, the kinetic energy,
$\sim \nabla^2 (1/r) \sim \delta(\mathbf{r})$, would generate a singularity at the origin in the
Schrödinger equation.

First, we explore how the angular momentum barrier [$\sim l(l+1)/r^2$] affects the
solution for $r$ close to zero, when $l \neq 0$. We assume that the potential is no
more singular than a Coulomb potential, that is, $r^2 V(r) \to 0$ as $r \to 0$, so that
the potential and the energy eigenvalue are negligible compared to the angular
momentum barrier for small $r$. Then the regular solution of the (asymptotic,
meaning approximate for small $r$) radial ODE

\[ \left( -\frac{\hbar^2}{2m} \frac{d^2}{dr^2} + \frac{\hbar^2}{2m} \frac{l(l+1)}{r^2} \right) u_l(r) = 0 \]

is $u_l(r) \sim r^{l+1}$ for small $r$, whereas the irregular solution $u_l(r) \sim r^{-l}$
is not finite at small $r$ and is not an acceptable solution. Thus, for $r \to 0$,
the radial wave function $R_l(r) \sim r^l$ due to the barrier.

Now we turn our attention to large $r \to \infty$, assuming $V(r) \to 0$, so that
we have to solve the asymptotic ODE

\[ -\frac{\hbar^2}{2m} \frac{d^2 u_l(r)}{dr^2} = E u_l(r) \]

because the angular momentum barrier is also negligible at large $r$. For bound
states $E < 0$, the solution is $u_l(r) \sim e^{-\kappa r}$ for large $r$ with
$E = -\hbar^2 \kappa^2/2m$, whereas the other independent solution $e^{\kappa r} \to \infty$
for large $r$.

This settles the asymptotic behavior of the wave functions, which must
have these limiting forms, $r^l$ at small $r$ and $e^{-\kappa r}$ at large $r$. In Chapter 13, the
complete solution will be given in terms of associated Laguerre polynomials
$L(r)$ so that $R_l(r) \sim r^l e^{-\kappa r} L(r)$. Therefore, orthonormality integrals
$\langle \psi | \psi \rangle$ will contain the weight function $r^{2l+2} e^{-2\kappa r}$ along with a
product of Laguerre polynomials, as shown in Table 9.1, except for scaling
$2\kappa r \to x$ and renaming $2l + 2 \to k$, and the corresponding ODEs are
self-adjoint. ■


EXAMPLE 9.1.2 


Legendre’s Equation Legendre’s equation is given by

\[ (1 - x^2) y'' - 2x y' + n(n+1) y = 0, \tag{9.9} \]

over the interval $-1 \le x \le 1$ and with boundary condition that $y(\pm 1)$ is finite.
From Eqs. (9.1), (9.8), and (9.9),

\[ p_0(x) = 1 - x^2 = p, \qquad w(x) = 1, \]
\[ p_1(x) = -2x = p', \qquad \lambda = n(n+1), \]
\[ p_2(x) = 0 = q. \]

Recall that our series solutions of Legendre’s equation (Exercise 8.5.5)^4
diverged, unless $n$ was restricted to an integer. This also represents a
quantization of the eigenvalue $\lambda$. ■


^4 Compare also Exercise 5.2.11.

When the equations of Chapter 8 are transformed into the self-adjoint form, we
find the values of the coefficients and parameters (Table 9.1). The coefficient
$p(x)$ is the coefficient of the second derivative of the eigenfunction. The
eigenvalue $\lambda$ is the parameter that is available in a term of the form
$\lambda w(x) y(x)$; any $x$ dependence apart from the eigenfunction becomes the
weighting function $w(x)$. If there is another term containing the eigenfunction
(not the derivatives), the coefficient of the eigenfunction in this additional
term is identified as $q(x)$. If no such term is present, $q(x)$ is simply zero.
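
As a worked illustration of this identification, the following sketch treats Bessel's equation. It assumes the standard form of that equation with parameter $a$ and order $n$, which is not written out in this section; dividing by $x$ and collecting the derivative terms brings it to the self-adjoint form $(p y')' + q y + \lambda w y = 0$.

```latex
\begin{align*}
x^2 y'' + x y' + (a^2 x^2 - n^2)\, y &= 0
  && \text{(Bessel's equation, assumed standard form)} \\
x y'' + y' + \left( a^2 x - \frac{n^2}{x} \right) y &= 0
  && \text{(divide by } x\text{)} \\
\frac{d}{dx}\left( x \frac{dy}{dx} \right) - \frac{n^2}{x}\, y + a^2 x\, y &= 0
  && \text{(since } (x y')' = x y'' + y'\text{)} \\
p(x) = x, \quad q(x) = -\frac{n^2}{x}, \quad \lambda = a^2, \quad w(x) &= x
  && \text{(compare with } (p y')' + q y + \lambda w y = 0\text{)}
\end{align*}
```

The factor of $x$ that multiplies the parameter $a^2$ is what becomes the weighting function, in the same way as described above for the general case.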


EXAMPLE 9.1.3 


Deuteron Further insight into the concepts of eigenfunction and eigenvalue
may be provided by an extremely simple model of the deuteron. The
neutron-proton nuclear interaction is represented by a square well potential:
$V = V_0 < 0$ for $0 \le r < a$, $V = 0$ for $r > a$. The Schrödinger wave equation is

\[ -\frac{\hbar^2}{2M} \nabla^2 \psi + V \psi = E \psi, \tag{9.10} \]

where $\psi = \psi(r)$ is the probability amplitude for finding a neutron-proton pair
at relative distance $r$. The boundary conditions are $\psi(0)$ finite and $\psi(r) \to 0$
for large $r$.

We may write $u(r) = r \psi(r)$, and using Exercises 2.5.12 and 2.5.14 the radial
wave equation becomes

\[ \frac{d^2 u}{dr^2} + k_1^2 u = 0, \tag{9.11} \]

with

\[ k_1^2 = \frac{2M}{\hbar^2} (E - V_0) > 0 \tag{9.12} \]

for the interior range, $0 \le r < a$. Here, $M$ is the reduced mass of the
neutron-proton system. Note that $V_0 < E < 0$ for a bound state, leading to the
sign of $k_1^2$ in Eq. (9.11). For $a < r < \infty$, we have

\[ \frac{d^2 u}{dr^2} - k_2^2 u = 0, \tag{9.13} \]

with

\[ k_2^2 = -\frac{2ME}{\hbar^2} > 0 \tag{9.14} \]

because $E < 0$ for a bound state with $V \to 0$ as $r \to \infty$. From the boundary
condition that $\psi$ remain finite, $u(0) = 0$, and

\[ u_{\rm in}(r) = \sin k_1 r, \qquad 0 \le r < a. \tag{9.15} \]

In the range outside the potential well, we have a linear combination of the
two exponentials,

\[ u_{\rm ex}(r) = A \exp k_2 r + B \exp(-k_2 r), \qquad a < r < \infty. \tag{9.16} \]

At $r = a$ the solution to the ODE must be continuous, with a continuous
derivative, demanding that $u_{\rm in}(a) = u_{\rm ex}(a)$ and that
$u'_{\rm in}(a) = u'_{\rm ex}(a)$. These joining or matching conditions give

\[ \begin{aligned}
\sin k_1 a &= A \exp k_2 a + B \exp(-k_2 a), \\
k_1 \cos k_1 a &= k_2 A \exp k_2 a - k_2 B \exp(-k_2 a).
\end{aligned} \tag{9.17} \]

The boundary condition for large $r$ means that $A = 0$. Dividing the preceding
pair of equations (to cancel $B$), we obtain

\[ \tan k_1 a = -\frac{k_1}{k_2} = -\sqrt{\frac{E - V_0}{-E}}, \tag{9.18} \]

a transcendental equation for the energy $E$ with only certain discrete solutions.
If $E$ is such that Eq. (9.18) can be satisfied, our solutions $u_{\rm in}(r)$ and
$u_{\rm ex}(r)$ can satisfy the boundary conditions. If Eq. (9.18) is not satisfied,
no acceptable solution exists. The values of $E$, for which Eq. (9.18) is satisfied,
are the eigenvalues; the corresponding function $\psi(r) = u_{\rm in}/r$ for $r < a$ and
$\psi(r) = u_{\rm ex}/r$ for $r > a$ is the eigenfunction. For the actual deuteron
problem, there is one (and only one) negative value of $E$ satisfying Eq. (9.18);
that is, the deuteron has one and only one bound state.

Now, what happens if $E$ does not satisfy Eq. (9.18) (i.e., $E \neq E_0$ is not
an eigenvalue)? In graphical form, imagine that $E$ and therefore $k_1$ are varied
slightly. For $E = E_1 < E_0$, $k_1$ is reduced, and $\sin k_1 a$ has not turned down
enough to match $\exp(-k_2 a)$. The joining conditions, Eq. (9.17), require $A > 0$
and the wave function goes to $+\infty$, exponentially. For $E = E_2 > E_0$, $k_1$ is
larger, $\sin k_1 a$ peaks sooner and has descended more rapidly at $r = a$. The
joining conditions demand $A < 0$, and the wave function goes to $-\infty$
exponentially. Only for $E = E_0$, an eigenvalue, will the wave function have the
required negative exponential asymptotic behavior (Fig. 9.1). ■
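
The matching condition of Eq. (9.18) can be solved numerically. The sketch below is only an illustration under stated assumptions: it uses units in which $\hbar^2/2M = 1$, a hypothetical well depth `V0 = -4` and radius `a = 1` (not the physical deuteron parameters), and a plain bisection search. It finds the root of $k_1 \cos k_1 a + k_2 \sin k_1 a = 0$, which is Eq. (9.18) multiplied through by $k_2 \cos k_1 a$.

```python
import math

def matching_residual(E, V0, a):
    """k1*cos(k1*a) + k2*sin(k1*a); zero exactly when tan(k1*a) = -k1/k2.

    Units are chosen so that hbar^2 / (2M) = 1 (an assumption made here
    for illustration), so k1^2 = E - V0 and k2^2 = -E.
    """
    k1 = math.sqrt(E - V0)
    k2 = math.sqrt(-E)
    return k1 * math.cos(k1 * a) + k2 * math.sin(k1 * a)

def bound_state_energy(V0, a, tol=1e-12):
    """Bisect on V0 < E < 0 for a sign change of the matching residual."""
    lo, hi = V0 + 1e-9, -1e-9
    f_lo = matching_residual(lo, V0, a)
    if f_lo * matching_residual(hi, V0, a) > 0:
        raise ValueError("no sign change: well too shallow or several roots")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = matching_residual(mid, V0, a)
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Hypothetical well: depth V0 = -4, radius a = 1 (dimensionless units).
V0, a = -4.0, 1.0
E0 = bound_state_energy(V0, a)
k1, k2 = math.sqrt(E0 - V0), math.sqrt(-E0)
print(E0, math.tan(k1 * a), -k1 / k2)
```

For this choice of depth and radius the residual changes sign only once between $V_0$ and $0$, so the search returns the single bound-state energy; a deeper well would need the interval split before bisecting.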


Figure 9.1 Deuteron wave functions; eigenfunction for $E = E_0$.


Boundary Conditions


In the foregoing definition of eigenfunction, it was noted that the eigenfunction
$u_\lambda(x)$ was required to satisfy certain imposed boundary conditions. The term
boundary conditions includes as a special case the concept of initial conditions,
for instance, specifying the initial position $x_0$ and the initial velocity $v_0$ in some
dynamical problem. The only difference in the present usage of boundary
conditions in these one-dimensional problems is that we are going to apply
the conditions on both ends of the allowed range of the variable to ensure a
self-adjoint ODE.

Usually, the form of the differential equation or the boundary conditions
on the eigenfunctions will guarantee that, at the ends of our interval (i.e., at
the boundary), the following products will vanish:

\[ p(x)\, v^*(x) \frac{du(x)}{dx} \bigg|_{x=a} = 0 \quad \text{and} \quad p(x)\, v^*(x) \frac{du(x)}{dx} \bigg|_{x=b} = 0. \tag{9.19} \]

Here, $u(x)$ and $v(x)$ are solutions of the particular ODE [Eq. (9.8)] being
considered. A reason for the particular form of Eq. (9.19) is suggested later. If we
recall the radial wave function $u$ of the hydrogen atom in Example 9.1.1 with
$u(0) = 0$ and $du/dr \sim e^{-\kappa r} \to 0$ as $r \to \infty$, then both boundary conditions are
satisfied. Similarly, in the deuteron Example 9.1.3, $\sin k_1 r \to 0$ as $r \to 0$ and
$d(e^{-k_2 r})/dr \to 0$ as $r \to \infty$, so both boundary conditions are obeyed.

We can, however, work with a less restrictive set of boundary conditions,

\[ v^* p u' \big|_{x=a} = v^* p u' \big|_{x=b}, \tag{9.20} \]

in which $u(x)$ and $v(x)$ are solutions of the differential equation corresponding
to the same or to different eigenvalues. Equation (9.20) might well be
satisfied if we were dealing with a periodic physical system, such as a crystal
lattice.

Equations (9.19) and (9.20) are written in terms of $v^*$, complex conjugate.
When the solutions are real, $v = v^*$ and the asterisk may be ignored. However,
in Fourier exponential expansions and in quantum mechanics the functions
will be complex and the complex conjugate will be needed.


Hermitian Operators 


We now prove an important property of the combination of a self-adjoint,
second-order differential operator [Eq. (9.6)] with functions $u(x)$ and $v(x)$ that
satisfy boundary conditions given by Eq. (9.20) and explain the special form of
the latter.

By integrating $v^*$ (complex conjugate) times the second-order self-adjoint
differential operator $\mathcal{L}$ (operating on $u$) over the range $a \le x \le b$, we obtain

\[ \int_a^b v^* \mathcal{L} u\, dx = \int_a^b v^* (p u')'\, dx + \int_a^b v^* q u\, dx \tag{9.21} \]

using Eq. (9.6), or in Dirac notation,

\[ \langle v | \mathcal{L} u \rangle = \left\langle v \,\middle|\, \left( \frac{d}{dx} p \frac{d}{dx} + q \right) u \right\rangle. \]

Integrating by parts, we have

\[ \int_a^b v^* (p u')'\, dx = v^* p u' \Big|_a^b - \int_a^b v^{*\prime} p u'\, dx. \tag{9.22} \]

The integrated part vanishes on application of the boundary conditions [Eq.
(9.20)]. Integrating the remaining integral by parts a second time, we have

\[ -\int_a^b v^{*\prime} p u'\, dx = -v^{*\prime} p u \Big|_a^b + \int_a^b u (p v^{*\prime})'\, dx. \tag{9.23} \]

Again, the integrated part vanishes in an application of Eq. (9.20). A
combination of Eqs. (9.21)-(9.23) gives us

\[ \langle v | \mathcal{L} u \rangle = \int_a^b v^* \mathcal{L} u\, dx = \int_a^b u \mathcal{L} v^*\, dx = \langle \mathcal{L} v | u \rangle. \tag{9.24} \]

This property, given by Eq. (9.24), is expressed by stating that the operator
$\mathcal{L}$ is Hermitian with respect to the functions $u(x)$ and $v(x)$, which satisfy
the boundary conditions specified by Eq. (9.20). Note that if this Hermitian
property follows from self-adjointness in a Hilbert space, then it includes that
boundary conditions are imposed on all functions of that space. The integral
in Eq. (9.24) may also be recognized as inner product, $\langle v | \mathcal{L} u \rangle$, of
$| v \rangle$ and $| \mathcal{L} u \rangle$.

These properties [Eqs. (9.19) or (9.20)] are so important for the concept
of Hermitian operator (discussed next) and the consequences (Section 9.2)
that the interval $(a, b)$ must be chosen so as to ensure that either Eq. (9.19)
or Eq. (9.20) is satisfied. The boundary conditions of the problem determine the
range of integration. If our solutions are polynomials, the coefficient $p(x)$ may
restrict the range of integration. Note that $p(x)$ also determines the singular
points of the differential equation (Section 8.4). For nonpolynomial solutions,
for example, $\sin nx$, $\cos nx$ ($p = 1$), the range of integration is determined by
the boundary conditions of each problem, as explained in the next example.
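
The Hermitian property of Eq. (9.24) can be checked exactly for a concrete operator. The sketch below is an illustration, not part of the original derivation: it takes the Legendre operator ($p = 1 - x^2$, $q = 0$, $w = 1$ on $[-1, 1]$, as in Table 9.1), two arbitrarily chosen real polynomials `u` and `v`, and exact rational arithmetic. Because $p(\pm 1) = 0$, the integrated terms in Eqs. (9.22) and (9.23) vanish for any polynomials.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    """Differentiate a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre_op(u):
    """Apply L u = d/dx[(1 - x^2) du/dx], i.e. p = 1 - x^2 and q = 0."""
    p = [Fraction(1), Fraction(0), Fraction(-1)]
    return poly_diff(poly_mul(p, poly_diff(u)))

def inner(v, u):
    """<v|u> = integral of v*u over [-1, 1] with w = 1 (real polynomials)."""
    prod = poly_mul(v, u)
    # integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

# Two arbitrary test polynomials (illustrative choices): u = x^2 + x, v = x^4 - 3x.
u = [Fraction(0), Fraction(1), Fraction(1)]
v = [Fraction(0), Fraction(-3), Fraction(0), Fraction(0), Fraction(1)]

lhs = inner(v, legendre_op(u))   # <v|Lu>
rhs = inner(legendre_op(v), u)   # <Lv|u>
print(lhs, rhs)
```

Neither polynomial is an eigenfunction, which is the point: the equality of the two inner products depends only on the operator and the boundary behavior, not on the functions being solutions of the eigenvalue equation.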


EXAMPLE 9.1.4 


Integration Interval, $[a, b]$ For $\mathcal{L} = d^2/dx^2$ a possible eigenvalue
equation is

\[ \frac{d^2}{dx^2} y(x) + n^2 y(x) = 0, \tag{9.25} \]

with eigenfunctions

\[ u_n = \cos nx, \qquad v_m = \sin mx. \]

Equation (9.20) becomes

\[ -n \sin mx \sin nx \Big|_a^b = 0, \quad \text{or} \quad m \cos mx \cos nx \Big|_a^b = 0, \]

interchanging $u_n$ and $v_m$. Since $\sin mx$ and $\cos nx$ are periodic with period $2\pi$
(for $n$ and $m$ integral), Eq. (9.20) is clearly satisfied if $a = x_0$ and $b = x_0 + 2\pi$. If
a problem prescribes a different interval, the eigenfunctions and eigenvalues
will change along with the boundary conditions. The functions must always
be chosen so that the boundary conditions [Eq. (9.20), etc.] are satisfied. For this
case (Fourier series), the usual cases are $x_0 = 0$, leading to $(0, 2\pi)$, and
$x_0 = -\pi$, leading to $(-\pi, \pi)$. Here and throughout the following several chapters,
the relevant functions will satisfy the boundary conditions prescribed by the
integration interval [Eq. (9.20)]. The interval $[a, b]$ and the weighting factor
$w(x)$ for the most commonly encountered second-order ODEs are listed in
Table 9.2. ■


Table 9.2

Equation                     $a$         $b$        $w(x)$
Legendre                     $-1$        $1$        $1$
Shifted Legendre             $0$         $1$        $1$
Associated Legendre          $-1$        $1$        $1$
Laguerre                     $0$         $\infty$   $e^{-x}$
Associated Laguerre          $0$         $\infty$   $x^k e^{-x}$
Hermite                      $-\infty$   $\infty$   $e^{-x^2}$
Simple harmonic oscillator   $0$         $2\pi$     $1$
                             $-\pi$      $\pi$      $1$

1. The orthogonality interval $[a, b]$ is determined by the boundary
conditions of Section 9.1. $p(x)$, $q(x)$ are given in Table 9.1.

2. The weighting function is established by putting the ODE in
self-adjoint form.
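
The role of the interval and weight listed for Laguerre's equation can be illustrated with a short exact calculation. This is a sketch under assumptions: the polynomials `L0`, `L1`, `L2` below are the standard low-order Laguerre polynomials, which are not quoted in this section, and the integrals use the elementary result $\int_0^\infty x^k e^{-x}\, dx = k!$.

```python
from fractions import Fraction
from math import factorial

def weighted_overlap(f, g):
    """Integral over [0, inf) of exp(-x) f(x) g(x) for polynomial coefficient
    lists (lowest power first), using integral of x^k exp(-x) dx = k!."""
    total = Fraction(0)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            total += fi * gj * factorial(i + j)
    return total

# Standard Laguerre polynomials (assumed forms, not quoted from this section):
L0 = [Fraction(1)]
L1 = [Fraction(1), Fraction(-1)]                    # 1 - x
L2 = [Fraction(1), Fraction(-2), Fraction(1, 2)]    # 1 - 2x + x^2/2

pairs = {"<L0|L1>": (L0, L1), "<L1|L2>": (L1, L2),
         "<L0|L2>": (L0, L2), "<L2|L2>": (L2, L2)}
for name, (f, g) in pairs.items():
    print(name, weighted_overlap(f, g))
```

Without the factor $e^{-x}$ the same integrals over $[0, \infty)$ would diverge, so the weight is what makes the overlap integrals meaningful on this interval.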


Hermitian Operators in Quantum Mechanics 


The preceding discussion focused on the classical second-order differential
operators of mathematical physics. Generalizing our Hermitian operator
theory, as required in quantum mechanics, we have an extension: The
operators need be neither second-order differential operators nor real. For example,
the linear momentum operator $p_x = -i\hbar(\partial/\partial x)$ represents a real physical
observable and will be an Hermitian operator. We simply assume (as is customary
in quantum mechanics) that the wave functions satisfy appropriate boundary
conditions in one or three (or other number of) dimensions, vanishing
sufficiently strongly at infinity or having periodic behavior (as in a crystal lattice
or unit intensity for waves). In practice, this means that the wave functions are
in a given Hilbert space. The operator $\mathcal{L}$ is called Hermitian if

\[ \langle \psi_1 | \mathcal{L} \psi_2 \rangle = \int \psi_1^* \mathcal{L} \psi_2\, d\tau = \int (\mathcal{L} \psi_1)^* \psi_2\, d\tau = \langle \mathcal{L} \psi_1 | \psi_2 \rangle \tag{9.26} \]

for all $\psi_1$, $\psi_2$ of a given Hilbert space. Apart from the simple extension to
complex quantities, this definition is identical to Eq. (9.24).

The adjoint $A^\dagger$ of an operator $A$ is defined by

\[ \langle \psi_1 | A^\dagger \psi_2 \rangle = \int \psi_1^* A^\dagger \psi_2\, d\tau \equiv \int (A \psi_1)^* \psi_2\, d\tau = \langle A \psi_1 | \psi_2 \rangle. \tag{9.27} \]

Comparing with our classical, second derivative operator-oriented Eq. (9.5)
defining $\bar{\mathcal{L}}$, we see that $\mathcal{L}^\dagger = \bar{\mathcal{L}}^*$ so that we have generalized
the adjoint operator to the complex domain (of quantum mechanics). Here, the
adjoint is defined in terms of the resultant integral, with the $A^\dagger$ as part of the
integrand. Clearly, if $A = A^\dagger$ (self-adjoint) and the (space of) functions, on
which it acts, satisfy the previously mentioned boundary conditions, then $A$ is
Hermitian.

The expectation value of an operator $\mathcal{L}$ is defined as

\[ \langle \mathcal{L} \rangle = \int \psi^* \mathcal{L} \psi\, d\tau = \langle \psi | \mathcal{L} \psi \rangle. \tag{9.28a} \]

In the framework of quantum mechanics $\langle \mathcal{L} \rangle$ corresponds to the theoretical
value of the physical observable represented by $\mathcal{L}$, if the physical system is in a
state described by the wave function $\psi$. When this property is measured
experimentally, $\langle \mathcal{L} \rangle$ may be obtained as the mean or average of many
measurements of the observable $\mathcal{L}$ of the physical system in the state $\psi$.

If we require $\mathcal{L}$ to be Hermitian, it is easy to show that $\langle \mathcal{L} \rangle$ is real (as
would be expected from a measurement in a physical theory). Taking the complex
conjugate of Eq. (9.28a), we obtain

\[ \langle \mathcal{L} \rangle^* = \left[ \int \psi^* \mathcal{L} \psi\, d\tau \right]^* = \int \psi \mathcal{L}^* \psi^*\, d\tau. \]

Rearranging the factors in the integrand, we have

\[ \langle \mathcal{L} \rangle^* = \int (\mathcal{L} \psi)^* \psi\, d\tau = \langle \mathcal{L} \psi | \psi \rangle. \]

Then, applying our definition of Hermitian operator [Eq. (9.26)], we get

\[ \langle \mathcal{L} \rangle^* = \int \psi^* \mathcal{L} \psi\, d\tau = \langle \mathcal{L} \rangle, \tag{9.28b} \]

or $\langle \mathcal{L} \rangle$ is real. It is worth noting that $\psi$ is not necessarily an
eigenfunction of $\mathcal{L}$.
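
The reality of an expectation value for a state that is not an eigenfunction can be illustrated numerically. The sketch below makes several assumptions that are not in the text: units with $\hbar = 1$, a Gaussian wave packet $\psi(x) = e^{-x^2/2} e^{ikx}$ chosen for convenience, and a simple midpoint sum in place of the integral. It evaluates $\langle p \rangle$ for the momentum operator $-i\, d/dx$ and divides by the norm, since the packet is not normalized.

```python
import cmath
import math

def momentum_expectation(k, x_max=10.0, n=20000):
    """<p> for psi(x) = exp(-x^2/2) * exp(i k x), in units with hbar = 1.

    Uses a midpoint Riemann sum on [-x_max, x_max]; the Gaussian envelope
    makes the neglected tails negligible. The packet is an illustrative
    choice, not one taken from the text.
    """
    dx = 2.0 * x_max / n
    num = 0j   # integral of psi^* (-i d/dx) psi
    den = 0.0  # integral of |psi|^2 (normalization)
    for j in range(n):
        x = -x_max + (j + 0.5) * dx
        psi = math.exp(-0.5 * x * x) * cmath.exp(1j * k * x)
        dpsi = (-x + 1j * k) * psi          # analytic derivative of psi
        num += psi.conjugate() * (-1j) * dpsi * dx
        den += abs(psi) ** 2 * dx
    return num / den

p_mean = momentum_expectation(k=1.5)
print(p_mean)
```

The integrand is complex at every point, yet the imaginary parts cancel in the sum, leaving a real mean momentum equal to the wave number `k` of the packet.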


EXERCISES 

9.1.1 Show that Laguerre’s equation may be put into self-adjoint form by
multiplying by $e^{-x}$ and that $w(x) = e^{-x}$ is the weighting function.

9.1.2 Show that the Hermite equation may be put into self-adjoint form by
multiplying by $e^{-x^2}$ and that this gives $w(x) = e^{-x^2}$ as the appropriate
density function.

9.1.3 Show the following when the linear second-order differential equation
is expressed in self-adjoint form:

(a) The Wronskian is equal to a constant divided by the initial
coefficient $p$.

\[ W[y_1, y_2] = \frac{C}{p(x)}. \]

(b) A second solution is given by

\[ y_2(x) = C y_1(x) \int^x \frac{dt}{p\,[y_1(t)]^2}. \]

9.1.4 For the very special case $\lambda = 0$ and $q(x) = 0$, the self-adjoint eigenvalue
equation becomes

\[ \frac{d}{dx}\left[ p(x) \frac{du(x)}{dx} \right] = 0, \]

satisfied by

\[ \frac{du}{dx} = \frac{1}{p(x)}. \]

Use this to obtain a “second” solution of the following:

(a) Legendre’s equation;

(b) Laguerre’s equation; and

(c) Hermite’s equation.

ANS. (a) $u_2(x) = \frac{1}{2} \ln \frac{1+x}{1-x}$,

(b) $u_2(x) - u_2(x_0) = \int_{x_0}^{x} e^t \frac{dt}{t}$,

(c) $u_2(x) = \int_0^x e^{t^2}\, dt$.

These second solutions illustrate the divergent behavior usually found
in a second solution.

Note. In all three cases $u_1(x) = 1$.

9.1.5 Given that $\mathcal{L} u = 0$ and $g \mathcal{L} u$ is self-adjoint, show that for the adjoint
operator $\bar{\mathcal{L}}$, $\bar{\mathcal{L}}(gu) = 0$.

9.1.6 For a second-order differential operator $\mathcal{L}$ that is self-adjoint show that

\[ \langle y_2 | \mathcal{L} y_1 \rangle - \langle y_1 | \mathcal{L} y_2 \rangle = \int_a^b [y_2 \mathcal{L} y_1 - y_1 \mathcal{L} y_2]\, dx = p\,(y_1' y_2 - y_1 y_2') \Big|_a^b. \]

9.1.7 Show that if a function $\psi$ is required to satisfy Laplace’s equation in a
finite region of space and to satisfy Dirichlet boundary conditions over
the entire closed bounding surface, then $\psi$ is unique.

Hint. One of the forms of Green’s theorem (Section 1.10) will be helpful.


9.1.8 Consider the solutions of the Legendre, Hermite, and Laguerre
equations to be polynomials. Show that the ranges of integration that
guarantee that the Hermitian operator boundary conditions will be satisfied
are

(a) Legendre $[-1, 1]$, (b) Hermite $(-\infty, \infty)$, (c) Laguerre $[0, \infty)$.

9.1.9 Within the framework of quantum mechanics [Eq. (9.26) and following],
show that the following are Hermitian operators:

(a) momentum $\mathbf{p} = -i\hbar \nabla = -i \frac{h}{2\pi} \nabla$; and

(b) angular momentum $\mathbf{L} = -i\hbar\, \mathbf{r} \times \nabla = -i \frac{h}{2\pi}\, \mathbf{r} \times \nabla$.

Hint. In Cartesian form $\mathbf{L}$ is a linear combination of noncommuting
Hermitian operators.

9.1.10 (a) $A$ is a non-Hermitian operator. In the sense of Eqs. (9.26) and (9.27),
show that

\[ A + A^\dagger \quad \text{and} \quad i(A - A^\dagger) \]

are Hermitian operators.

(b) Using the preceding result, show that every non-Hermitian operator
may be written as a linear combination of two Hermitian operators.

9.1.11 $U$ and $V$ are two arbitrary operators, not necessarily Hermitian. In the
sense of Eq. (9.27), show that

\[ (UV)^\dagger = V^\dagger U^\dagger. \]

Note the resemblance to Hermitian adjoint matrices.

Hint. Apply the definition of adjoint operator [Eq. (9.27)].

9.1.12 Prove that the product of two Hermitian operators is Hermitian [Eq.
(9.26)] if and only if the two operators commute.

9.1.13 $A$ and $B$ are noncommuting quantum mechanical operators:

\[ AB - BA = iC. \]

Show that $C$ is Hermitian. Assume that appropriate boundary conditions
are satisfied.

9.1.14 The operator $\mathcal{L}$ is Hermitian. Show that $\langle \mathcal{L}^2 \rangle \ge 0$.

9.1.15 A quantum mechanical expectation value is defined by

\[ \langle A \rangle = \int \psi^*(x) A \psi(x)\, dx = \langle \psi | A \psi \rangle, \]

where $A$ is a linear operator. Show that demanding that $\langle A \rangle$ be real
means that $A$ must be Hermitian with respect to $\psi(x)$.

9.1.16 From the definition of adjoint [Eq. (9.27)], show that $A^{\dagger\dagger} = A$ in the
sense that $\int \psi_1^* A^{\dagger\dagger} \psi_2\, d\tau = \int \psi_1^* A \psi_2\, d\tau$. The adjoint of the
adjoint is the original operator.

Hint. The functions $\psi_1$ and $\psi_2$ of Eq. (9.27) represent a class of
functions. The subscripts 1 and 2 may be interchanged or replaced by other
subscripts.

9.1.17 For a quantum particle moving in a potential well, $V(x) = \frac{1}{2} m \omega^2 x^2$, the
Schrödinger wave equation is

\[ -\frac{\hbar^2}{2m} \frac{d^2 \psi(x)}{dx^2} + \frac{1}{2} m \omega^2 x^2 \psi(x) = E \psi(x) \]

or

\[ \frac{d^2 \psi(z)}{dz^2} - z^2 \psi(z) = -\frac{2E}{\hbar\omega} \psi(z), \]

where $z = (m\omega/\hbar)^{1/2} x$. Since this operator is even, we expect solutions
of definite parity. For the initial conditions that follow, integrate out
from the origin and determine the minimum constant $2E/\hbar\omega$ that will
lead to $\psi(\infty) = 0$ in each case. (You may take $z = 6$ as an approximation
of infinity.)

(a) For an even eigenfunction,

\[ \psi(0) = 1, \qquad \psi'(0) = 0. \]

(b) For an odd eigenfunction,

\[ \psi(0) = 0, \qquad \psi'(0) = 1. \]

Note. Analytical solutions appear in Section 13.1.


9.2 Hermitian Operators 


Hermitian or self-adjoint operators with appropriate boundary conditions have
the following properties that are of extreme importance in classical and
quantum physics:


1. The eigenvalues of an Hermitian operator are real.

2. The eigenfunctions of an Hermitian operator are orthogonal.

3. The eigenfunctions of an Hermitian operator form a complete set, meaning
that under suitable conditions a function can be expanded in a series of
eigenfunctions.^5


Real Eigenvalues 


We proceed to prove the first two of these three properties. Let

\[ \mathcal{L} u_i + \lambda_i w u_i = 0, \quad \text{or} \quad \mathcal{L} | u_i \rangle + \lambda_i w | u_i \rangle = 0. \tag{9.29} \]

Assuming the existence of a second eigenvalue and eigenfunction

\[ \mathcal{L} u_j + \lambda_j w u_j = 0, \quad \text{or} \quad \mathcal{L} | u_j \rangle + \lambda_j w | u_j \rangle = 0. \tag{9.30} \]


^5 This property is not universal. It does hold for our linear, second-order differential operators
in Sturm-Liouville (self-adjoint) form. Completeness is defined and discussed in more detail in
Section 9.4. A proof that the eigenfunctions of our linear, second-order, self-adjoint, differential
equations form a complete set may be developed from the calculus of variations of Section 18.6.

Then, taking the Hermitian adjoint, using $\mathcal{L}^\dagger = \mathcal{L}$ we obtain

\[ \mathcal{L} u_j^* + \lambda_j^* w u_j^* = 0, \quad \text{or} \quad \langle u_j | \mathcal{L} + \langle u_j | \lambda_j^* w = 0, \tag{9.31} \]

where $p$ and $q$ are real functions of $x$, and $w(x)$ is a real function. However, we
permit $\lambda_k$, the eigenvalues, and $u_k$, the eigenfunctions, to be complex.
Multiplying Eq. (9.29) by $u_j^*$ (or $\langle u_j |$) and Eq. (9.31) by $u_i$ (or $| u_i \rangle$) and
then subtracting, we have

\[ u_j^* \mathcal{L} u_i - u_i \mathcal{L} u_j^* = (\lambda_j^* - \lambda_i)\, w u_i u_j^*. \tag{9.32} \]

We integrate over the range $a \le x \le b$,

\[ \int_a^b u_j^* \mathcal{L} u_i\, dx - \int_a^b u_i \mathcal{L} u_j^*\, dx = (\lambda_j^* - \lambda_i) \int_a^b u_i u_j^* w\, dx, \tag{9.33} \]

or in Dirac notation,

\[ \langle u_j | \mathcal{L} u_i \rangle - \langle \mathcal{L} u_j | u_i \rangle = (\lambda_j^* - \lambda_i) \langle u_j | u_i \rangle. \]

Since $\mathcal{L}$ is Hermitian, the left-hand side vanishes by Eq. (9.27) and

\[ (\lambda_j^* - \lambda_i) \int_a^b u_i u_j^* w\, dx = (\lambda_j^* - \lambda_i) \langle u_j | u_i \rangle = 0. \tag{9.34} \]

If $i = j$, the integral cannot vanish [$w(x) > 0$, apart from isolated points],
except in the trivial case $u_i = 0$. Hence, the coefficient $(\lambda_i^* - \lambda_i)$ must be zero,

\[ \lambda_i^* = \lambda_i, \tag{9.35} \]

which states that the eigenvalue is real. Since $\lambda_i$ can represent any one of
the eigenvalues, this proves the first property. This is an exact analog of the
nature of the eigenvalues of real symmetric (and Hermitian) matrices (compare
Sections 3.3 and 3.4).

Real eigenvalues of Hermitian operators have a fundamental significance
in quantum mechanics. In quantum mechanics the eigenvalues correspond
to observable (precisely measurable or sharp) quantities, such as energy and
angular momentum. When a single measurement of an observable $\mathcal{L}$ is made,
the result must be one of its eigenvalues. With the theory formulated in terms
of Hermitian operators, this proof of real eigenvalues guarantees that the
theory will predict real numbers for these measurable physical quantities. In
Section 18.6, it will be seen that for some operators, such as Hamiltonians,
the set of real eigenvalues has a lower bound. Physically important Hermitian
operators are real potentials $V^* = V$ and the momentum operator $-i\, d/dx$.
The latter is Hermitian because upon using integration by parts and discarding
the integrated term, we have

\[ \int_{-\infty}^{\infty} \psi_1^* \left( -i \frac{d\psi_2}{dx} \right) dx = -i \psi_1^* \psi_2 \Big|_{-\infty}^{\infty} + i \int_{-\infty}^{\infty} \frac{d\psi_1^*}{dx} \psi_2\, dx = \int_{-\infty}^{\infty} \left( -i \frac{d\psi_1}{dx} \right)^* \psi_2\, dx. \]


Orthogonal Eigenfunctions


If we take $i \neq j$ and if $\lambda_i \neq \lambda_j$ in Eq. (9.34), the integral of the product
of the two different eigenfunctions must vanish:

\[ \int_a^b u_i u_j^* w\, dx = \langle u_j | u_i \rangle = 0. \tag{9.36} \]

This condition, called orthogonality, is the continuum analog of the
vanishing of a scalar (or inner) product of two vectors.^6 We say that the
eigenfunctions $u_i(x)$ and $u_j(x)$ are orthogonal with respect to the weighting
function $w(x)$ over the interval $[a, b]$. Equation (9.36) constitutes a partial
proof of the second property of our Hermitian operators. Again, the precise
analogy with matrix analysis should be noted. Indeed, we can establish a
one-to-one correspondence between this Sturm-Liouville theory of differential
equations and the treatment of Hermitian matrices. Historically, this
correspondence has been significant in establishing the mathematical equivalence
of matrix mechanics developed by Heisenberg and wave mechanics developed
by Schrödinger. Today, the two diverse approaches are merged into the
theory of quantum mechanics, and the mathematical formulation that is more
convenient for a particular problem is used for that problem. Actually, the
mathematical alternatives do not end here. Integral equations form a third
equivalent and sometimes more convenient or more powerful approach.
Similarly, any two functions $u$, $v$, not necessarily eigenfunctions, are orthogonal if
$\langle v | u \rangle = \int_a^b v^* u w\, dx = 0$.

This proof of orthogonality is not quite complete. There is a loophole
because we may have $u_i \neq u_j$ but still have $\lambda_i = \lambda_j$. Such eigenvalues are
labeled degenerate. Illustrations of degeneracy are given at the end of this
section. If $\lambda_i = \lambda_j$, the integral in Eq. (9.34) need not vanish. This means that
linearly independent eigenfunctions corresponding to the same eigenvalue are
not automatically orthogonal and that some other method must be sought to
obtain an orthogonal set. Although the eigenfunctions in this degenerate case
may not be orthogonal, they can always be made orthogonal. One method
is developed in the next section. See also the discussion after Eq. (4.13) for
degeneracy due to symmetry.

We shall see in succeeding chapters that it is just as desirable to have a given
set of functions orthogonal as it is to have an orthogonal coordinate system.
We can work with nonorthogonal functions, but they are likely to prove as
messy as an oblique coordinate system.


^6 From the definition of Riemann integral,

\[ \int_a^b f(x) g(x)\, dx = \lim_{N \to \infty} \left( \sum_{i=1}^{N} f(x_i) g(x_i) \right) \Delta x, \]

where $x_0 = a$, $x_N = b$, and $x_i - x_{i-1} = \Delta x$. If we interpret $f(x_i)$ and $g(x_i)$ as the
$i$th components of an $N$ component vector, then this sum (and therefore this integral)
corresponds directly to a scalar product of vectors, Eq. (1.11). The vanishing of the scalar
product is the condition for orthogonality of the vectors or functions.

EXAMPLE 9.2.1


Fourier Series: Orthogonality To continue Example 9.1.4 with the interval
$-\pi \le x \le \pi$, the eigenvalue equation [Eq. (9.25)],

\[ \frac{d^2}{dx^2} y(x) + n^2 y(x) = 0, \]

subject to Eq. (9.20), may describe a vibrating violin string with
eigenfunctions $\sin nx$ subject to the boundary conditions $\sin(\pm n\pi) = 0$ so that $n$
is an integer, and the orthogonality integrals become

\[ \text{(a)} \quad \int_{x_0 - \pi}^{x_0 + \pi} \sin mx \sin nx\, dx = C_n \delta_{nm}, \qquad x_0 = 0. \]

For an interval of length $2\pi$ the preceding analysis guarantees the Kronecker
delta in (a). Our Sturm-Liouville theory says nothing about the values of $C_n$.

Similarly, a quantum mechanical particle in a box may have eigenfunctions
$\cos nx$ subject to the boundary conditions $\frac{d \cos nx}{dx}\big|_{x = \pm\pi} = 0$ giving
integer $n$ again. Then

\[ \text{(b)} \quad \int_{x_0 - \pi}^{x_0 + \pi} \cos mx \cos nx\, dx = D_n \delta_{nm}, \qquad x_0 = 0, \]

where $D_n$ remains undetermined. Actual calculation yields

\[ C_n = \begin{cases} \pi, & n \neq 0, \\ 0, & n = 0, \end{cases} \qquad D_n = \begin{cases} \pi, & n \neq 0, \\ 2\pi, & n = 0. \end{cases} \]

Finally, inspection shows that

\[ \text{(c)} \quad \int_{x_0 - \pi}^{x_0 + \pi} \sin mx \cos nx\, dx = 0 \]

always vanishes for all integral $m$ and $n$.
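
The integrals (a) to (c) can be checked numerically by treating sampled function values as vector components and the integral as a scaled dot product. The sketch below is an illustration only: the helper names `overlap`, `sin_n`, and `cos_n` are introduced here, and the midpoint rule with 2000 points is an arbitrary choice that happens to be very accurate for these periodic integrands.

```python
import math

def overlap(f, g, a=-math.pi, b=math.pi, n=2000):
    """Approximate the integral of f(x)*g(x) over [a, b] by the midpoint rule.

    This is a dot product of the sampled vectors f(x_i) and g(x_i)
    multiplied by dx (weight w = 1).
    """
    dx = (b - a) / n
    xs = [a + (i + 0.5) * dx for i in range(n)]
    return sum(f(x) * g(x) for x in xs) * dx

def sin_n(n):
    return lambda x: math.sin(n * x)

def cos_n(n):
    return lambda x: math.cos(n * x)

C_2 = overlap(sin_n(2), sin_n(2))    # should be close to pi
D_0 = overlap(cos_n(0), cos_n(0))    # should be close to 2*pi
ss23 = overlap(sin_n(2), sin_n(3))   # should be close to 0
sc23 = overlap(sin_n(2), cos_n(3))   # should be close to 0
print(C_2, D_0, ss23, sc23)
```

The normalization constants come out of the calculation, whereas the vanishing of the mixed overlaps is what the general theory guarantees.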


Expansion in Orthogonal Eigenfunctions 


Starting from some Hamiltonian $H$ and its eigenvalue equation
$H | \psi \rangle = E | \psi \rangle$, we determine the set of eigenvalues $E_j$ and
eigenfunctions $| \varphi_j \rangle$ taking the latter to be orthonormal; that is,

\[ \langle \varphi_k | \varphi_j \rangle = \delta_{jk}. \]

The property of completeness of the set $| \varphi_j \rangle$ means that certain classes of
(e.g., sectionally or piecewise continuous) functions may be represented by
a series of orthogonal eigenfunctions to any desired degree of accuracy. We
now assume $| \psi \rangle$ is in that class and expand it as

\[ | \psi \rangle = \sum_j a_j | \varphi_j \rangle. \]

We determine the coefficient by projection

EXAMPLE 9.2.2


Calling $\langle \varphi_k | H \varphi_j \rangle \equiv H_{k,j}$ the matrix elements of the Hamiltonian,
we have the eigenvalue equations

\[ \sum_j H_{k,j}\, a_j = E a_k, \tag{9.37} \]

from which the column vector of admixture coefficients $a_j$ may be determined,
along with the eigenvalue $E$. This usually infinite set of linear equations is
truncated in practice.

The choice of eigenfunction is made on the basis of convenience. To
illustrate the expansion technique, let us choose the eigenfunctions of
Example 9.2.1, $\cos nx$ and $\sin nx$. The eigenfunction series is conveniently
(and conventionally) written as the Fourier series

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx). \]

From the orthogonality integrals of Example 9.2.1 the coefficients are given
by

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt\, dt, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt\, dt, \qquad n = 0, 1, 2, \ldots. \]


Square Wave Now consider the square wave

\[ f(x) = \begin{cases} \dfrac{h}{2}, & 0 < x < \pi, \\[4pt] -\dfrac{h}{2}, & -\pi < x < 0. \end{cases} \]

Direct substitution of $\pm h/2$ for $f(t)$ yields

\[ a_n = 0, \]

which is expected because of the antisymmetry, $f(-x) = -f(x)$, and

\[ b_n = \frac{h}{n\pi} (1 - \cos n\pi) = \begin{cases} 0, & n \text{ even}, \\ \dfrac{2h}{n\pi}, & n \text{ odd}. \end{cases} \]

Hence, the eigenfunction (Fourier) expansion of the square wave is

\[ f(x) = \frac{2h}{\pi} \sum_{n=0}^{\infty} \frac{\sin(2n+1)x}{2n+1}. \tag{9.38} \]
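
The convergence of Eq. (9.38) can be examined by summing a finite number of terms. The following sketch is illustrative: the function name `square_wave_partial_sum` and the choices $h = 1$ and the listed term counts are assumptions made here. At $x = \pi/2$ the series reduces to the alternating Leibniz series, so the partial sums approach $h/2$ slowly.

```python
import math

def square_wave_partial_sum(x, h=1.0, terms=50):
    """Partial sum of Eq. (9.38): (2h/pi) * sum_{n<terms} sin((2n+1)x)/(2n+1)."""
    total = sum(math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))
    return 2.0 * h / math.pi * total

h = 1.0
for terms in (5, 50, 2000):
    print(terms, square_wave_partial_sum(math.pi / 2, h, terms))
```

The slow decay of the coefficients, proportional to $1/(2n+1)$, reflects the discontinuities of the square wave at $x = 0$ and $x = \pm\pi$.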


Additional examples, using other eigenfunctions, appear in Chapters 11-13.

Degeneracy


The concept of degeneracy was introduced earlier. If $N$ linearly independent
eigenfunctions correspond to the same eigenvalue, the eigenvalue is said to
be $N$-fold degenerate. A particularly simple illustration is provided by the
eigenvalues and eigenfunctions of the linear oscillator equation of classical
mechanics (Example 9.2.1). For each value of the eigenvalue $n$, there are two
possible solutions: $\sin nx$ and $\cos nx$ (and any linear combination). We may
say the eigenfunctions are degenerate or the eigenvalue is degenerate.

When an underlying symmetry, such as rotational invariance, is causing the 
degeneracies, states belonging to the same energy eigenvalue will then form 
a multiplet or representation of the symmetry group. The powerful group- 
theoretical methods are treated in Chapter 4 in detail. 

In Section 9.3, we show an alternative method of how such functions may 
be made orthogonal. 


Biographical Data 

Sturm, Jacques Charles. Sturm, who was born in 1803 in Geneva,
Switzerland and died in 1855, was Poisson’s successor at the Sorbonne and worked
with his friend Liouville on heat flow, from which arose the eigenvalue
problems now named after both.

Liouville, Joseph. Liouville (1809-1882), a professor at the Collège de
France, made contributions to elliptic functions, analytic functions, and
quadratic forms.


EXERCISES 

9.2.1 The functions $u_1(x)$ and $u_2(x)$ are eigenfunctions of the same Hermitian
operator but for distinct eigenvalues $\lambda_1$ and $\lambda_2$. Show that $u_1(x)$ and $u_2(x)$
are linearly independent.

9.2.2 (a) The vectors $\mathbf{e}_n$ are orthogonal to each other: $\mathbf{e}_n \cdot \mathbf{e}_m = 0$ for $n \ne m$.

Show that they are linearly independent.

(b) The functions $\psi_n(x)$ are orthogonal to each other over the interval
$[a, b]$ and with respect to the weighting function $w(x)$. Show that the
$\psi_n(x)$ are linearly independent.

9.2.3 Given that 



are solutions of Legendre's differential equation corresponding to different
eigenvalues:

(a) Evaluate their orthogonality integral

(b) Explain why these two functions are not orthogonal, that is, why the
proof of orthogonality does not apply.

9.2.4 $T_0(x) = 1$ and $V_1(x) = (1 - x^2)^{1/2}$ are solutions of the Chebyshev
differential equation corresponding to different eigenvalues. Explain, in terms
of the boundary conditions, why these two functions are not orthogonal.

9.2.5 (a) Show that the first derivatives of the Legendre polynomials satisfy a
self-adjoint differential equation with eigenvalue $\lambda = n(n+1) - 2$.
(b) Show that these Legendre polynomial derivatives satisfy an
orthogonality relation



Note. In Section 11.5, $(1 - x^2)^{1/2} P_n'(x)$ will be labeled an associated
Legendre polynomial, $P_n^1(x)$.

9.2.6 A set of functions $u_n(x)$ satisfies the Sturm-Liouville equation

\[ \frac{d}{dx}\left[p(x)\frac{d}{dx}u_n(x)\right] + \lambda_n w(x)u_n(x) = 0. \]

The functions $u_m(x)$ and $u_n(x)$ satisfy boundary conditions that lead to
orthogonality. The corresponding eigenvalues $\lambda_m$ and $\lambda_n$ are distinct.
Prove that for appropriate boundary conditions $u_m'(x)$ and $u_n'(x)$ are
orthogonal with $p(x)$ as a weighting function.

9.2.7 A linear operator $A$ has $n$ distinct eigenvalues and $n$ corresponding
eigenfunctions: $A\psi_i = \lambda_i\psi_i$. Show that the $n$ eigenfunctions are linearly
independent. $A$ is not necessarily Hermitian.

Hint. Assume linear dependence, that is, $\psi_n = \sum_{i=1}^{n-1} a_i\psi_i$. Use this
relation and the operator-eigenfunction equation first in one order and then
in the reverse order. Show that a contradiction results.


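9.2.8 With $\mathcal{L}$ not self-adjoint, $\bar{\mathcal{L}} \ne \mathcal{L}$,

\[ \mathcal{L}u_i + \lambda_i w u_i = 0 \]

and

\[ \bar{\mathcal{L}}v_j + \lambda_j w v_j = 0. \]

(a) Show that

\[ \int_a^b v_j \mathcal{L}u_i\,dx = \int_a^b u_i \bar{\mathcal{L}}v_j\,dx, \]

provided

\[ u_i p_0 v_j'\Big|_a^b = v_j p_0 u_i'\Big|_a^b \]

and

\[ u_i (p_1 - p_0') v_j\Big|_a^b = 0. \]

(b) Show that the orthogonality integral for the eigenfunctions $u_i$ and
$v_j$ becomes

\[ \int_a^b u_i v_j w\,dx = 0 \qquad (\lambda_i \ne \lambda_j). \]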


9.3 Gram-Schmidt Orthogonalization 


The Gram-Schmidt orthogonalization is a method that takes a nonorthogo- 
nal set of linearly independent functions 7 and constructs an orthogonal set 
over an arbitrary interval and with respect to an arbitrary weight or density 
factor w that may or may not originate from our basic Eq. (9.8). The choice 
of these weights gives a particular set of orthonormal functions in the end 
(orthogonal plus unit normalization). In the language of linear algebra, the 
process is equivalent to a matrix transformation relating an orthogonal set 
of basis vectors (functions) to a nonorthogonal set. The functions involved 
may be real or complex. Here, for convenience they are assumed to be real. 
The generalization to more than one dimension or to complex cases offers no 
difficulty. 

Before taking up orthogonalization, we should consider normalization of 
functions. So far, no normalization has been specified. This means that 

\[ \langle\varphi_i|\varphi_i\rangle = \int_a^b \varphi_i^2\,w\,dx = N_i^2, \]

but no attention has been paid to the value of $N_i$. We now demand that each
function $\varphi_i(x)$ be multiplied by $N_i^{-1}$ so that the new (normalized) $\varphi_i$ will satisfy

\[ \langle\varphi_i|\varphi_i\rangle = \int_a^b \varphi_i^2(x)\,w(x)\,dx = 1 \tag{9.39} \]

and

\[ \langle\varphi_j|\varphi_i\rangle = \int_a^b \varphi_i(x)\varphi_j(x)\,w(x)\,dx = \delta_{ij}. \tag{9.40} \]

Equation (9.39) states that we have normalized to unity. Including the
property of orthogonality, we have Eq. (9.40). Functions satisfying this equation
are said to be orthonormal (orthogonal plus unit normalization). Other
normalizations are certainly possible, and indeed, by historical convention, each
of the special functions of mathematical physics treated in Chapters 12 and 13
will be normalized differently.


7 Such a set of functions might well arise from the solutions of a PDE, in which the eigenvalue was 
independent of one or more of the constants of separation. Note, however, that the origin of the 
set of functions is irrelevant to the Gram-Schmidt orthogonalization procedure. 
We consider three sets of functions: an original, linearly independent given
set $u_n(x)$, $n = 0, 1, 2, \ldots$; an orthogonalized set $\psi_n(x)$ to be constructed; and a
final set of functions $\varphi_n(x)$ that are the normalized $\psi_n$. The original $u_n$ may be
degenerate eigenfunctions, but this is not necessary. We shall have

$u_n(x)$                  $\psi_n(x)$                  $\varphi_n(x)$
Linearly independent    Linearly independent    Linearly independent
Nonorthogonal           Orthogonal              Orthogonal
Unnormalized            Unnormalized            Normalized (orthonormal)


The Gram-Schmidt procedure takes the $n$th $\psi$ function, $\psi_n$, to be $u_n(x)$
plus an unknown linear combination of the previous $\varphi$. The presence of the
new $u_n(x)$ will guarantee linear independence. The requirement that $\psi_n(x)$
be orthogonal to each of the previous $\varphi$ yields just enough constraints to
determine each of the unknown coefficients. Then the fully determined $\psi_n$ will
be normalized to unity, yielding $\varphi_n(x)$. Then the sequence of steps is repeated
for $\psi_{n+1}(x)$.

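We start with $n = 0$, letting

\[ \psi_0(x) = u_0(x) \tag{9.41} \]

with no "previous" $\varphi$ to worry about. Then normalize

\[ \varphi_0(x) = \frac{\psi_0(x)}{\left[\int_a^b \psi_0^2\,w\,dx\right]^{1/2}}, \quad\text{or}\quad |\varphi_0\rangle = \frac{|\psi_0\rangle}{[\langle\psi_0|\psi_0\rangle]^{1/2}}. \tag{9.42} \]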


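For $n = 1$, let

\[ \psi_1(x) = u_1(x) + a_{10}\varphi_0(x). \tag{9.43} \]

We demand that $\psi_1(x)$ be orthogonal to $\varphi_0(x)$. [At this stage the normalization
of $\psi_1(x)$ is irrelevant.] This demand of orthogonality leads to

\[ \int_a^b \psi_1\varphi_0\,w\,dx = \int_a^b u_1\varphi_0\,w\,dx + a_{10}\int_a^b \varphi_0^2\,w\,dx = 0. \tag{9.44} \]

Since $\varphi_0$ is normalized to unity [Eq. (9.42)], we have

\[ a_{10} = -\int_a^b u_1\varphi_0\,w\,dx = -\langle\varphi_0|u_1\rangle, \tag{9.45} \]

fixing the value of $a_{10}$. In Dirac notation we write Eq. (9.43) as

\[ |\psi_1\rangle = |u_1\rangle - \langle\varphi_0|u_1\rangle|\varphi_0\rangle \tag{9.43a} \]

and Eq. (9.44) as

\[ 0 = \langle\varphi_0|\psi_1\rangle = \langle\varphi_0|u_1\rangle + a_{10}\langle\varphi_0|\varphi_0\rangle. \tag{9.44a} \]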
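In this form we recognize that the coefficient $a_{10}$ is determined by projection,
similar to expanding a vector $\mathbf{u}$ in terms of Cartesian coordinate or basis
vectors $\hat{\mathbf{x}}_i$ as

\[ \mathbf{u} = (\mathbf{u}\cdot\hat{\mathbf{x}}_1)\hat{\mathbf{x}}_1 + (\mathbf{u}\cdot\hat{\mathbf{x}}_2)\hat{\mathbf{x}}_2 + (\mathbf{u}\cdot\hat{\mathbf{x}}_3)\hat{\mathbf{x}}_3 = \sum_i u_i\hat{\mathbf{x}}_i. \tag{9.45a} \]

Normalizing, we define

\[ \varphi_1(x) = \frac{\psi_1(x)}{\left(\int_a^b \psi_1^2\,w\,dx\right)^{1/2}}, \quad\text{or}\quad |\varphi_1\rangle = \frac{|\psi_1\rangle}{[\langle\psi_1|\psi_1\rangle]^{1/2}}. \tag{9.46} \]

Finally, we generalize so that

\[ \varphi_i(x) = \frac{\psi_i(x)}{\left(\int_a^b \psi_i^2(x)\,w(x)\,dx\right)^{1/2}}, \tag{9.47} \]

where

\[ \psi_i(x) = u_i + a_{i0}\varphi_0 + a_{i1}\varphi_1 + \cdots + a_{i,i-1}\varphi_{i-1}. \tag{9.48} \]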


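The coefficients $a_{ij}$ are again given by projection (using orthogonality)

\[ a_{ij} = -\int_a^b u_i\varphi_j\,w\,dx = -\langle\varphi_j|u_i\rangle. \tag{9.49} \]

Equation (9.49) holds for unit normalization. If some other normalization
is selected,

\[ \int_a^b [\varphi_j(x)]^2\,w(x)\,dx = N_j^2, \]

then Eq. (9.47) is replaced by

\[ \varphi_i(x) = N_i\frac{\psi_i(x)}{\left(\int_a^b \psi_i^2\,w\,dx\right)^{1/2}} \tag{9.47a} \]

and $a_{ij}$ becomes

\[ a_{ij} = -\frac{\int_a^b u_i\varphi_j\,w\,dx}{N_j^2}. \tag{9.49a} \]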


Equations (9.48) and (9.49) may be rewritten in terms of projection
operators, $P_j$. If we consider the $\varphi_n(x)$ to form a linear vector space, then
the integral in Eq. (9.49) may be interpreted as the projection of $u_i$ into the $\varphi_j$
"coordinate" or the $j$th component of $u_i$. With

\[ P_j u_i(x) = \left[\int_a^b u_i(t)\varphi_j(t)\,w(t)\,dt\right]\varphi_j(x), \]

that is, $P_j = |\varphi_j\rangle\langle\varphi_j|$, Equation (9.48) becomes

\[ \psi_i(x) = \left\{1 - \sum_{j=0}^{i-1} P_j\right\}u_i(x). \tag{9.48a} \]

Subtracting off the $j$th components, $j = 0$ to $i - 1$ (the labeling of Eq. (9.48),
in which the first function is $\varphi_0$), leaves $\psi_i(x)$ orthogonal to
all the $\varphi_j(x)$.
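
The procedure of Eqs. (9.47) to (9.49) can be sketched numerically. This is an illustrative sketch only, under stated assumptions: the integrals are approximated by a midpoint rule, and the names `inner`, `gram_schmidt`, and `unit_weight` are hypothetical helpers, not anything defined in the text. The trial set $u_n = x^n$ on $[-1, 1]$ with $w = 1$ is chosen as the input.

```python
import math


def inner(f, g, w, a, b, samples=4000):
    """Midpoint-rule estimate of the weighted inner product <f|g>."""
    dx = (b - a) / samples
    total = 0.0
    for k in range(samples):
        x = a + (k + 0.5) * dx
        total += f(x) * g(x) * w(x)
    return total * dx


def gram_schmidt(us, w, a, b):
    """Build orthonormal callables phi_0, phi_1, ... from the list us."""
    phis = []
    for u in us:
        # a_ij = -<phi_j|u_i>, Eq. (9.49), for unit-normalized phi_j
        coeffs = [-inner(phi, u, w, a, b) for phi in phis]
        prev = list(phis)

        def psi(x, u=u, coeffs=coeffs, prev=prev):
            # psi_i = u_i + sum_j a_ij phi_j, Eq. (9.48)
            return u(x) + sum(c * p(x) for c, p in zip(coeffs, prev))

        norm = math.sqrt(inner(psi, psi, w, a, b))
        # phi_i = psi_i / <psi_i|psi_i>^(1/2), Eq. (9.47)
        phis.append(lambda x, psi=psi, norm=norm: psi(x) / norm)
    return phis


def unit_weight(x):
    return 1.0


us = [lambda x, n=n: x ** n for n in range(3)]
phis = gram_schmidt(us, unit_weight, -1.0, 1.0)
print(phis[2](1.0), math.sqrt(5 / 2))
```

Because the same quadrature is used for the projections and for the norms, the resulting functions are orthonormal with respect to the discretized inner product; they approximate the exact orthonormal set only to the accuracy of the quadrature.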

Note that although this Gram-Schmidt procedure is one possible way of 
constructing an orthogonal or orthonormal set, the functions <Pi(x) are not 
unique. There is an infinite number of possible orthonormal sets for a given 
interval and a given density function. As an illustration of the freedom involved, 
consider two (nonparallel) vectors $\mathbf{A}$ and $\mathbf{B}$ in the $xy$-plane. We may normalize
$\mathbf{A}$ to unit magnitude and then form $\mathbf{B}' = a\mathbf{A} + \mathbf{B}$ so that $\mathbf{B}'$ is perpendicular to
$\mathbf{A}$. By normalizing $\mathbf{B}'$ we have completed the Gram-Schmidt orthogonalization
for two vectors. However, any two perpendicular unit vectors, such as $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$,
could have been chosen as our orthonormal set. Again, with an infinite number
of possible rotations of $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ about the $z$-axis, we have an infinite number
of possible orthonormal sets.
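
The two-vector illustration can be made concrete with a short sketch. The vectors $(3, 4)$ and $(1, 0)$ and the rotation angle $0.3$ rad are arbitrary assumed values, and the helper names are hypothetical.

```python
import math


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def normalize(u):
    length = math.sqrt(dot(u, u))
    return [p / length for p in u]


def gram_schmidt_pair(A, B):
    """Normalize A, then form B' = a*A_hat + B perpendicular to A_hat."""
    a_hat = normalize(A)
    a = -dot(a_hat, B)  # coefficient fixed by the orthogonality demand
    b_prime = [a * p + q for p, q in zip(a_hat, B)]
    return a_hat, normalize(b_prime)


def rotated_axes(angle):
    """The unit vectors x and y rotated about the z-axis by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    return [c, s], [-s, c]


e1, e2 = gram_schmidt_pair([3.0, 4.0], [1.0, 0.0])
f1, f2 = rotated_axes(0.3)
print(e1, e2, dot(e1, e2))
print(f1, f2, dot(f1, f2))
```

Both pairs are orthonormal, yet they differ: the Gram-Schmidt pair is tied to the direction of $\mathbf{A}$, whereas the rotated pair depends only on the chosen angle.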


EXAMPLE 9.3.1 


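Legendre Polynomials by Gram-Schmidt Orthogonalization Let us
form an orthonormal set from the set of functions $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$.
The interval is $-1 \le x \le 1$ and the density function is $w(x) = 1$.

In accordance with the Gram-Schmidt orthogonalization process described,

\[ u_0 = 1, \quad\text{hence}\quad \varphi_0 = \frac{1}{\sqrt{2}}. \tag{9.50} \]

Then

\[ \psi_1(x) = x + a_{10}\frac{1}{\sqrt{2}} \tag{9.51} \]

and

\[ a_{10} = -\int_{-1}^{1}\frac{x}{\sqrt{2}}\,dx = 0 \tag{9.52} \]

by symmetry. We normalize $\psi_1$ to obtain

\[ \varphi_1(x) = \sqrt{\frac{3}{2}}\,x. \tag{9.53} \]

Then continue the Gram-Schmidt process with

\[ \psi_2(x) = x^2 + a_{20}\frac{1}{\sqrt{2}} + a_{21}\sqrt{\frac{3}{2}}\,x, \tag{9.54} \]

where

\[ a_{20} = -\int_{-1}^{1}\frac{x^2}{\sqrt{2}}\,dx = -\frac{\sqrt{2}}{3}, \tag{9.55} \]

\[ a_{21} = -\int_{-1}^{1}\sqrt{\frac{3}{2}}\,x^3\,dx = 0, \tag{9.56} \]

again by symmetry. Therefore,

\[ \psi_2(x) = x^2 - \frac{1}{3}, \tag{9.57} \]

and, on normalizing to unity, we have

\[ \varphi_2(x) = \sqrt{\frac{5}{2}}\cdot\frac{1}{2}(3x^2 - 1). \tag{9.58} \]

The next function becomes

\[ \varphi_3(x) = \sqrt{\frac{7}{2}}\cdot\frac{1}{2}(5x^3 - 3x). \tag{9.59} \]

Reference to Chapter 11 will show that

\[ \varphi_n(x) = \sqrt{\frac{2n+1}{2}}\,P_n(x), \tag{9.60} \]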

where $P_n(x)$ is the $n$th-order Legendre polynomial. Our Gram-Schmidt process
provides a possible, but very cumbersome, method of generating the Legendre
polynomials. It illustrates how a power series expansion in $u_n(x) = x^n$, which
is not orthogonal, can be converted into an orthogonal series over the finite
interval $[-1, 1]$.

The equations for Gram-Schmidt orthogonalization tend to be ill conditioned
because of the subtractions. A technique for avoiding this difficulty
using the polynomial recurrence relation is discussed by Hamming.$^{8}$

In Example 9.3.1, we have specified an orthogonality interval $[-1, 1]$, a unit
weighting function, and a set of functions, $x^n$, to be taken one at a time in
increasing order. Given all these specifications the Gram-Schmidt procedure
is unique (to within a normalization factor and an overall sign as discussed
subsequently). Our resulting orthogonal set, the Legendre polynomials, $P_0$ through
$P_n$, forms a complete set for the description of polynomials of order $\le n$ over
$[-1, 1]$. This concept of completeness is taken up in detail in Section 9.4.
Expansions of functions in series of Legendre polynomials are discussed in
Section 11.3. ■


Orthogonal Polynomials 


The previous example was chosen strictly to illustrate the Gram-Schmidt
procedure. Although it has the advantage of introducing the Legendre polynomials,
the initial functions $u_n = x^n$ are not degenerate eigenfunctions and are
not solutions of Legendre's equation. They are simply a set of functions that
we have rearranged to create an orthonormal set for the given interval and


8 Hamming, R. W. (1973). Numerical Methods for Scientists and Engineers, 2nd ed. McGraw-Hill,
New York. See Section 27.2 and references given there.
Table 9.3 Orthogonal Polynomials Generated by Gram-Schmidt Orthogonalization of $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$

Polynomials          | Interval               | Weighting Function $w(x)$ | Standard Normalization
Legendre             | $-1 \le x \le 1$       | $1$                       | $\int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n+1}$
Shifted Legendre     | $0 \le x \le 1$        | $1$                       | $\int_0^1[P_n^*(x)]^2\,dx = \frac{1}{2n+1}$
Laguerre             | $0 \le x < \infty$     | $e^{-x}$                  | $\int_0^\infty[L_n(x)]^2 e^{-x}\,dx = 1$
Associated Laguerre  | $0 \le x < \infty$     | $x^k e^{-x}$              | $\int_0^\infty[L_n^k(x)]^2 x^k e^{-x}\,dx = \frac{(n+k)!}{n!}$
Hermite              | $-\infty < x < \infty$ | $e^{-x^2}$                | $\int_{-\infty}^{\infty}[H_n(x)]^2 e^{-x^2}\,dx = 2^n\pi^{1/2}n!$


given weighting function. The fact that we obtained the Legendre polynomials
is not quite magic but a direct consequence of the choice of interval and
weighting function. The use of $u_n(x) = x^n$, but with other choices of interval
and weighting function or a different ordering of the functions, leads to other
sets of orthogonal polynomials as shown in Table 9.3. We consider these
polynomials in detail in Chapters 11 and 13 as solutions of particular differential
equations.
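
The dependence on interval and weight can be illustrated with exact rational arithmetic. The sketch below is an assumption-laden illustration, not the book's method: it represents each polynomial by its coefficient list (lowest power first), supplies the weight and interval only through the moments $\mu_k = \int x^k w(x)\,dx$, and projects onto unnormalized $\psi_j$ (dividing by $\langle\psi_j|\psi_j\rangle$) so that no square roots appear. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def gram_schmidt_monic(moments, count):
    """Orthogonalize 1, x, x^2, ... given moments(k) = integral of x^k w(x) dx.

    Returns coefficient lists (lowest power first) of the monic orthogonal
    polynomials psi_0, ..., psi_{count-1}.
    """
    def inner(p, q):
        return sum(pc * qc * moments(j + k)
                   for j, pc in enumerate(p) for k, qc in enumerate(q))

    basis = []
    for n in range(count):
        u = [Fraction(0)] * n + [Fraction(1)]   # u_n = x^n
        psi = list(u)
        for prev in basis:
            # projection of u_n on an unnormalized psi_j
            coeff = inner(prev, u) / inner(prev, prev)
            for k, c in enumerate(prev):
                psi[k] -= coeff * c
        basis.append(psi)
    return basis


def legendre_moment(k):
    """Integral of x^k over [-1, 1] with w(x) = 1."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)


def laguerre_moment(k):
    """Integral of x^k e^{-x} over [0, infinity), which equals k!."""
    return Fraction(factorial(k))


print(gram_schmidt_monic(legendre_moment, 4))
print(gram_schmidt_monic(laguerre_moment, 3))
```

With the Legendre moments the third polynomial is $x^2 - \tfrac13$, in agreement with Eq. (9.57); with the Laguerre moments it is $x^2 - 4x + 2$, which is $2L_2(x)$. The conventional polynomials then follow by rescaling to the standard normalizations of Table 9.3.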

An examination of this orthogonalization process reveals two arbitrary
features. First, as emphasized previously, it is not necessary to normalize the
functions to unity. In the example just given, we could have required

\[ \int_{-1}^{1}\varphi_n(x)\varphi_m(x)\,dx = \frac{2}{2n+1}\delta_{nm}, \tag{9.61} \]

and the resulting set would have been the actual Legendre polynomials.
Second, the sign of $\varphi_n$ is always indeterminate. In the example, we chose the
sign by requiring the coefficient of the highest power of $x$ in the polynomial
to be positive. For the Laguerre polynomials, on the other hand, we would
require the coefficient of the highest power to be $(-1)^n/n!$.


EXERCISES 

9.3.1 Rework Example 9.3.1 by replacing $\varphi_n(x)$ by the conventional Legendre
polynomial, $P_n(x)$:

\[ \int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n+1}. \]

Using Eqs. (9.47a) and (9.49a), construct $P_0$, $P_1(x)$, and $P_2(x)$.


ANS. $P_0 = 1$, $P_1 = x$, $P_2 = \frac{3}{2}x^2 - \frac{1}{2}$.
9.3.2 Following the Gram-Schmidt procedure, construct a set of polynomials
$P_n^*(x)$ orthogonal (unit weighting factor) over the range $[0, 1]$ from the
set $[1, x]$. Normalize so that $P_n^*(1) = 1$.

ANS. $P_0^*(x) = 1$,

$P_1^*(x) = 2x - 1$,

$P_2^*(x) = 6x^2 - 6x + 1$,

$P_3^*(x) = 20x^3 - 30x^2 + 12x - 1$.

These are the first four shifted Legendre polynomials.

Note. The asterisk is the standard notation for "shifted": $[0, 1]$ instead of
$[-1, 1]$. It does not mean complex conjugate.

9.3.3 Apply the Gram-Schmidt procedure to form the first three Laguerre
polynomials, using

\[ u_n(x) = x^n, \quad n = 0, 1, 2, \ldots, \quad 0 \le x < \infty, \quad w(x) = e^{-x}. \]

The conventional normalization is

\[ \langle L_m|L_n\rangle = \int_0^\infty L_m(x)L_n(x)e^{-x}\,dx = \delta_{mn}. \]

ANS. $L_0 = 1$, $L_1 = (1 - x)$, $L_2 = \dfrac{2 - 4x + x^2}{2}$.

9.3.4 You are given

(a) a set of functions $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$,

(b) an interval $(0, \infty)$, and

(c) a weighting function $w(x) = xe^{-x}$. Use the Gram-Schmidt procedure
to construct the first three orthonormal functions from the set
$u_n(x)$ for this interval and this weighting function.

ANS. $\varphi_0(x) = 1$, $\varphi_1(x) = (x - 2)/\sqrt{2}$, $\varphi_2(x) = (x^2 - 6x + 6)/2\sqrt{3}$.

9.3.5 Using the Gram-Schmidt orthogonalization procedure, construct the
lowest three Hermite polynomials, using

\[ u_n(x) = x^n, \quad n = 0, 1, 2, \ldots, \quad -\infty < x < \infty, \quad w(x) = e^{-x^2}. \]

For this set of polynomials the usual normalization is

\[ \int_{-\infty}^{\infty} H_m(x)H_n(x)w(x)\,dx = \delta_{mn}2^m m!\,\pi^{1/2}. \]

ANS. $H_0 = 1$, $H_1 = 2x$, $H_2 = 4x^2 - 2$.

9.3.6 As a modification of Exercise 9.3.5, apply the Gram-Schmidt
orthogonalization procedure to the set $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $0 \le x < \infty$.
Take $w(x)$ to be $\exp[-x^2]$. Find the first two nonvanishing polynomials.
Normalize so that the coefficient of the highest power of $x$ is unity. In
Exercise 9.3.5, the interval $(-\infty, \infty)$ led to the Hermite polynomials.
These are certainly not the Hermite polynomials.

ANS. $\varphi_0 = 1$, $\varphi_1 = x - \pi^{-1/2}$.
9.3.7 Form an orthogonal set over the interval $0 \le x < \infty$, using $u_n(x) =
e^{-nx}$, $n = 1, 2, 3, \ldots$. Take the weighting factor, $w(x)$, to be unity. These
functions are solutions of $u_n'' - n^2 u_n = 0$, which is clearly already in
Sturm-Liouville (self-adjoint) form. Why doesn't the Sturm-Liouville
theory guarantee the orthogonality of these functions?


9.4 Completeness of Eigenfunctions 


The third important property of an Hermitian operator is that its eigenfunctions
form a complete set. This completeness means that any well-behaved (at least
piecewise continuous) function $F(x)$ can be approximated by a series

\[ F(x) = \sum_{n=0}^{\infty} a_n\varphi_n(x) \tag{9.62} \]

to any desired degree of accuracy.$^{9}$ More precisely, the set $\varphi_n(x)$ is called
complete$^{10}$ if the limit of the mean square error vanishes:

\[ \lim_{m\to\infty}\int_a^b\left[F(x) - \sum_{n=0}^{m} a_n\varphi_n(x)\right]^2 w(x)\,dx = 0. \tag{9.63} \]


We have not required that the error vanish identically in $[a, b]$ but only that
the integral of the error squared go to zero. This convergence in the mean
[Eq. (9.63)] should be compared with uniform convergence [Section 5.5,
Eq. (5.43)]. Clearly, uniform convergence implies convergence in the mean,
but the converse does not hold; convergence in the mean is less restrictive.
Specifically, Eq. (9.63) is not upset by piecewise continuous functions with
only a finite number of finite discontinuities. Equation (9.63) is perfectly
adequate for our purposes and is far more convenient than Eq. (5.43). Indeed,
since we frequently use expansions in eigenfunctions to describe
discontinuous functions, convergence in the mean is all we can expect.
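
Convergence in the mean can be observed with the square wave of Eq. (9.38). In this sketch the height $h = 1$, the midpoint-rule quadrature, and the helper names are assumptions made for illustration; the weight is $w(x) = 1$ on $[-\pi, \pi]$.

```python
import math

H = 1.0  # assumed height h of the square wave


def square_wave(x):
    return H / 2 if x > 0 else -H / 2


def partial_sum(x, terms):
    """First `terms` terms of Eq. (9.38)."""
    return (2 * H / math.pi) * sum(
        math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))


def mean_square_error(terms, samples=20000):
    """Midpoint-rule estimate of the integral in Eq. (9.63) with w = 1."""
    dx = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x = -math.pi + (k + 0.5) * dx
        total += (square_wave(x) - partial_sum(x, terms)) ** 2
    return total * dx


errors = [mean_square_error(terms) for terms in (1, 10, 100)]
print(errors)
```

The integrated squared error shrinks as terms are added, roughly like $h^2/(\pi N)$ for $N$ terms, even though the series cannot converge uniformly near the discontinuity at $x = 0$.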

In the language of linear algebra, we have a linear space, a function vector
space. The linearly independent, orthonormal functions $\varphi_n(x)$ form the
basis for this (infinite-dimensional) space. Equation (9.62) is a statement that
the functions $\varphi_n(x)$ span this linear space. With an inner product defined by
Eq. (9.36), our linear space is a Hilbert space spanned by the complete set
of basis states $\varphi_n(x)$; it contains all square-integrable functions $F$ that can be
expanded in the sense of Eq. (9.63).

The question of completeness of a set of functions is often determined by 
comparison with a Laurent series (Section 6.5). In Section 14.1, this is done 
for Fourier series, thus establishing the completeness of Fourier series. For 
all orthogonal polynomials mentioned in Section 9.3, it is possible to find a 


9 If we have a finite set, as with vectors, the summation is over the number of linearly independent 
members of the set. 

10 Many authors use the term closed here. 
polynomial expansion of each power of $z$,

\[ z^n = \sum_{i=0}^{n} a_i P_i(z), \tag{9.64} \]

where $P_i(z)$ is the $i$th polynomial. Exercises 11.4.6, 13.1.8, and 13.2.5 are
specific examples of Eq. (9.64). Using Eq. (9.64), we may reexpress the Laurent
expansion of $f(z)$ in terms of the polynomials, showing that the polynomial
expansion exists (when it exists, it is unique; Exercise 9.4.1). The limitation
of this Laurent series development is that it requires the function to be
analytic. Equations (9.62) and (9.63) are more general. $F(x)$ may be only
piecewise continuous. Numerous examples of the representation of such piecewise
continuous functions appear in Chapter 14 (Fourier series). A proof that our
Sturm-Liouville eigenfunctions form complete sets appears in Courant and
Hilbert.$^{11}$

In Eq. (9.62) the expansion coefficients $a_m$ may be determined by

\[ a_m = \int_a^b F(x)\varphi_m(x)w(x)\,dx = \langle\varphi_m|F\rangle. \tag{9.65} \]

This follows from multiplying Eq. (9.62) by $\varphi_m(x)w(x)$ and integrating. In Dirac
notation,

\[ |F\rangle = \sum_n a_n|\varphi_n\rangle \quad\text{implies}\quad \langle\varphi_m|F\rangle = \sum_n a_n\langle\varphi_m|\varphi_n\rangle = \sum_n a_n\delta_{mn} = a_m \]

provided the $|\varphi_n\rangle$ are normalized to unity. From the orthogonality of the
eigenfunctions, $\varphi_n(x)$, only the $m$th term survives. Here, we see the value of
orthogonality. Equation (9.65) may be compared with the dot or inner product of
vectors (Section 1.2) and $a_m$ interpreted as the $m$th projection of the function
$F(x)$. Often, the coefficient $a_m$ is called a generalized Fourier coefficient.

For a known function $F(x)$, Eq. (9.65) gives $a_m$ as a definite integral that
can always be evaluated, by computer if not analytically.
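
As a sketch of such a computer evaluation, the coefficients of Eq. (9.65) can be estimated for the orthonormal functions of Example 9.3.1 on $[-1, 1]$ with $w(x) = 1$. The trial function $F(x) = x^2$, the midpoint rule, and the helper names are assumptions chosen for illustration.

```python
import math


def phi(n, x):
    """Orthonormal functions of Example 9.3.1 for n = 0, 1, 2."""
    if n == 0:
        return 1 / math.sqrt(2)
    if n == 1:
        return math.sqrt(3 / 2) * x
    if n == 2:
        return math.sqrt(5 / 2) * 0.5 * (3 * x * x - 1)
    raise ValueError("only n = 0, 1, 2 are tabulated in this sketch")


def coefficient(F, m, samples=20000):
    """Midpoint-rule estimate of a_m = <phi_m|F>, Eq. (9.65)."""
    dx = 2.0 / samples
    return dx * sum(F(x) * phi(m, x)
                    for x in (-1 + (k + 0.5) * dx for k in range(samples)))


def F(x):
    return x * x  # assumed trial function


a = [coefficient(F, m) for m in range(3)]


def reconstruct(x):
    """Partial sum of Eq. (9.62) with the three computed coefficients."""
    return sum(a[m] * phi(m, x) for m in range(3))


print(a, reconstruct(0.3))
```

Since $x^2$ is a polynomial of degree 2, three terms reproduce it up to quadrature error; for a general $F(x)$ the same partial sum would only approximate the function.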

For examples of particular eigenfunction expansions, see the following:
Fourier series, Chapter 14; Legendre series, Section 11.3; Laplace series,
Section 11.6; Bessel and Fourier-Bessel expansions, Section 12.1; Hermite
series, Section 13.1; and Laguerre series, Section 13.2. An explicit case of a
Fourier expansion is the square wave (Example 9.2.2). The corresponding
Hilbert space contains only periodic functions that can be expanded in a series
of sin nx, cos nx, the eigenfunctions of one-dimensional square-well potentials
in quantum mechanics under suitable boundary conditions.

It may also happen that the eigenfunction expansion [Eq. (9.62)] is the
expansion of an unknown $F(x)$ in a series of known eigenfunctions $\varphi_n(x)$ with
unknown coefficients $a_n$. An example is the quantum chemist's attempt to
describe an (unknown) molecular wave function as a linear combination of
known atomic wave functions. The unknown coefficients $a_n$ would be
determined by a variational technique, Rayleigh-Ritz (Section 18.6).


11 Courant, R., and Hilbert, D. (1953). Methods of Mathematical Physics (English translation),
Vol. 1. Interscience, New York. Reprinted, Wiley (1989), Chap. 6, Section 3. 
Bessel's Inequality


If the set of functions $\varphi_n(x)$ does not form a complete set, possibly because we
have not included the required infinite number of members of an infinite set, we
are led to an inequality. First, consider the case of a finite sum of components.
Let $\mathbf{A}$ be an $n$ component vector,

\[ \mathbf{A} = \mathbf{e}_1 a_1 + \mathbf{e}_2 a_2 + \cdots + \mathbf{e}_n a_n, \tag{9.66} \]

in which $\mathbf{e}_i$ form a set of orthonormal unit vectors and $a_i$ is the corresponding
component (projection) of $\mathbf{A}$; that is,

\[ a_i = \mathbf{A}\cdot\mathbf{e}_i. \tag{9.67} \]

Then

\[ \left(\mathbf{A} - \sum_i \mathbf{e}_i a_i\right)^2 \ge 0. \tag{9.68} \]

If we sum over all $n$ components, clearly the summation equals $\mathbf{A}$ by
Eq. (9.66) and the equality holds. If, however, the summation does not
include all $n$ components, the inequality results. By expanding Eq. (9.68) and
remembering that the orthogonal unit vectors satisfy orthogonality relations

\[ \mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}, \tag{9.69} \]

we have

\[ A^2 \ge \sum_i a_i^2. \tag{9.70} \]

This is Bessel's inequality.

For functions we consider the integral

\[ \int_a^b\left[f(x) - \sum_{i=1}^{n} a_i\varphi_i(x)\right]^2 w(x)\,dx \ge 0. \tag{9.71} \]

This is the continuum analog of Eq. (9.68), letting $n \to \infty$ and replacing the
summation by an integration. Again, with the weighting factor $w(x) > 0$,
the integrand is nonnegative. The integral vanishes by Eq. (9.62) if we have
a complete set. Otherwise it is positive. Expanding the squared term, we
obtain

\[ \int_a^b [f(x)]^2 w(x)\,dx - 2\sum_{i=1}^{n} a_i\int_a^b f(x)\varphi_i(x)w(x)\,dx + \sum_{i=1}^{n} a_i^2 \ge 0. \tag{9.72} \]

Applying Eq. (9.65), we have

\[ \int_a^b [f(x)]^2 w(x)\,dx \ge \sum_{i=1}^{n} a_i^2. \tag{9.73} \]

Hence, the sum of the squares of the expansion coefficients $a_i$ is less than or
equal to the weighted integral of $[f(x)]^2$, the equality holding if and only if the
expansion is exact, that is, if the set of functions $\varphi_n(x)$ is a complete set and
$n \to \infty$.

In later chapters, when we consider eigenfunctions that form complete 
sets (such as Legendre polynomials), Eq. (9.73) with the equal sign holding is 
called a Parseval relation. 

Bessel’s inequality has a variety of uses, including proof of convergence of 
the Fourier series. 
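
Bessel's inequality, Eq. (9.73), can be checked against the square wave of Example 9.2.2. The sketch assumes $h = 1$ and uses the unit-normalized functions $\varphi_n(x) = \sin nx/\sqrt{\pi}$ on $[-\pi, \pi]$, so that $a_n = \sqrt{\pi}\,b_n$ with $b_n = 2h/n\pi$ for odd $n$ taken from the text; the cosine coefficients vanish. The names `bound` and `coefficient_sum` are hypothetical.

```python
import math

H = 1.0  # assumed height h of the square wave

# Left side of Eq. (9.73): integral of [f(x)]^2 over [-pi, pi] with w = 1
bound = math.pi * H ** 2 / 2


def coefficient_sum(terms):
    """Sum of a_n^2 over the first `terms` odd harmonics."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1
        b_n = 2 * H / (n * math.pi)
        total += math.pi * b_n ** 2  # a_n = sqrt(pi) * b_n
    return total


for terms in (1, 10, 1000):
    print(terms, coefficient_sum(terms), bound)
```

Every partial sum stays below the bound, and the gap closes as more terms are included, which is the Parseval limit of the inequality.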


Schwarz Inequality 


The frequently used Schwarz inequality is similar to the Bessel inequality.
Consider the quadratic equation with unknown $x$:

\[ \sum_{i=1}^{n}(a_i x + b_i)^2 = \sum_{i=1}^{n} a_i^2\left(x + \frac{b_i}{a_i}\right)^2 = 0. \tag{9.74} \]

If $b_i/a_i$ = constant $c$, then the solution is $x = -c$. If $b_i/a_i$ is not a constant,
all terms cannot vanish simultaneously for real $x$. Therefore, the solution must
be complex. Expanding, we find that

\[ x^2\sum_{i=1}^{n} a_i^2 + 2x\sum_{i=1}^{n} a_i b_i + \sum_{i=1}^{n} b_i^2 = 0, \tag{9.75} \]

and since $x$ is complex (or $= -b_i/a_i$), the quadratic formula$^{12}$ for $x$ leads to

\[ \left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right), \tag{9.76} \]

the equality holding when $b_i/a_i$ equals a constant.

Once more, in terms of vectors, we have

\[ (\mathbf{a}\cdot\mathbf{b})^2 = a^2 b^2\cos^2\theta \le a^2 b^2, \tag{9.77} \]

where $\theta$ is the angle included between $\mathbf{a}$ and $\mathbf{b}$.

The analogous Schwarz inequality for functions has the form

\[ \left|\int_a^b f^*(x)g(x)w(x)\,dx\right|^2 \le \int_a^b f^*(x)f(x)w(x)\,dx\int_a^b g^*(x)g(x)w(x)\,dx, \tag{9.78} \]

the equality holding if and only if $g(x) = \alpha f(x)$, with $\alpha$ being a constant.
To prove this function form of the Schwarz inequality,$^{13}$ consider a complex
function $\psi(x) = f(x) + \lambda g(x)$ with $\lambda$ a complex constant. The functions $f(x)$
and $g(x)$ are any two functions (for which the integrals exist). Multiplying by


12 With discriminant $b^2 - 4ac$ negative (or zero).

13 An alternate derivation is provided by the inequality $\iint [f(x)g(y) - f(y)g(x)]^*[f(x)g(y) -
f(y)g(x)]\,w(x)w(y)\,dx\,dy \ge 0$.
the complex conjugate and integrating, we obtain

\[ \int_a^b \psi^*\psi\,w(x)\,dx = \int_a^b f^*f\,w(x)\,dx + \lambda\int_a^b f^*g\,w(x)\,dx + \lambda^*\int_a^b g^*f\,w(x)\,dx + \lambda\lambda^*\int_a^b g^*g\,w(x)\,dx \ge 0. \tag{9.79} \]

The $\ge 0$ appears since $\psi^*\psi$ is nonnegative, the equal (=) sign holding only
if $\psi(x)$ is identically zero. Noting that $\lambda$ and $\lambda^*$ are linearly independent, we
differentiate with respect to one of them and set the derivative equal to zero
to minimize $\int_a^b \psi^*\psi\,w(x)\,dx$:

\[ \frac{\partial}{\partial\lambda^*}\int_a^b \psi^*\psi\,w(x)\,dx = \int_a^b g^*f\,w(x)\,dx + \lambda\int_a^b g^*g\,w(x)\,dx = 0. \]

This yields

\[ \lambda = -\frac{\int_a^b g^*f\,w(x)\,dx}{\int_a^b g^*g\,w(x)\,dx}. \tag{9.80a} \]

Taking the complex conjugate, we obtain

\[ \lambda^* = -\frac{\int_a^b f^*g\,w(x)\,dx}{\int_a^b g^*g\,w(x)\,dx}. \tag{9.80b} \]

Substituting these values of $\lambda$ and $\lambda^*$ back into Eq. (9.79), we obtain Eq. (9.78),
the Schwarz inequality.

In quantum mechanics $f(x)$ and $g(x)$ might each represent a state or
configuration of a physical system. Then the Schwarz inequality guarantees that
the inner product $\int_a^b f^*(x)g(x)w(x)\,dx$ exists. In some texts, the Schwarz
inequality is a key step in the derivation of the Heisenberg uncertainty principle.

The function notation of Eqs. (9.78) and (9.79) is relatively cumbersome.
In advanced mathematical physics, and especially in quantum mechanics, it is
common to use the Dirac bra-ket notation:

\[ \langle f|g\rangle = \int_a^b f^*(x)g(x)w(x)\,dx. \]

Using this new notation, we simply understand the range of integration, $(a, b)$,
and any weighting function. In this notation the Schwarz inequality becomes

\[ |\langle f|g\rangle|^2 \le \langle f|f\rangle\langle g|g\rangle. \tag{9.78a} \]

If $g(x)$ is a normalized eigenfunction, $\varphi_i(x)$, Eq. (9.78) yields

\[ a_i^* a_i \le \int_a^b f^*(x)f(x)w(x)\,dx, \tag{9.81} \]

a result that also follows from Eq. (9.73).
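
A quick numerical check of Eq. (9.78a) is sketched below for real functions. The choices $f(x) = x$, $g(x) = e^x$, the interval $[0, 1]$, the unit weight, and the midpoint rule are all assumptions made for illustration.

```python
import math


def inner(f, g, a=0.0, b=1.0, samples=20000):
    """Midpoint-rule estimate of <f|g> for real f, g with w(x) = 1."""
    dx = (b - a) / samples
    return dx * sum(f(x) * g(x)
                    for x in (a + (k + 0.5) * dx for k in range(samples)))


def f(x):
    return x


g = math.exp


def scaled(x):
    return 2.0 * f(x)  # g = alpha * f with alpha = 2: the equality case


lhs = inner(f, g) ** 2
rhs = inner(f, f) * inner(g, g)
gap = inner(f, scaled) ** 2 - inner(f, f) * inner(scaled, scaled)
print(lhs, rhs, gap)
```

For the independent pair the left side is strictly smaller than the right, while for $g = 2f$ the two sides coincide, as the equality condition requires.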
Summary of Vector Spaces: Completeness


Here we summarize some properties of vector space, first with the vectors 
taken to be the familiar real vectors of Chapter 1 and then with the vectors 
taken to be ordinary functions—polynomials. The concept of completeness 
is developed for finite vector spaces and carried over into infinite vector 
spaces. 


1v. We shall describe our vector space with a set of $n$ linearly independent
vectors $\mathbf{e}_i$, $i = 1, 2, \ldots, n$. If $n = 3$, $\mathbf{e}_1 = \hat{\mathbf{x}}$, $\mathbf{e}_2 = \hat{\mathbf{y}}$, and $\mathbf{e}_3 = \hat{\mathbf{z}}$. The $n$ $\mathbf{e}_i$
span the linear vector space and are defined to be a basis.

1f. We shall describe our vector (function) space with a set of $n$ linearly
independent functions, $\varphi_i(x)$, $i = 0, 1, \ldots, n - 1$. The index $i$ starts with 0 to
agree with the labeling of the classical polynomials. Here, $\varphi_i(x)$ is assumed
to be a polynomial of degree $i$. The $n$ $\varphi_i(x)$ span the linear vector (function)
space forming a basis.

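2v. The vectors in our vector space satisfy the following relations
(Section 1.2; the vector components are numbers):

a. Vector addition is commutative: $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$

b. Vector addition is associative: $[\mathbf{u} + \mathbf{v}] + \mathbf{w} = \mathbf{u} + [\mathbf{v} + \mathbf{w}]$

c. There is a null vector: $\mathbf{0} + \mathbf{v} = \mathbf{v}$

d. Multiplication by a scalar
Distributive: $a[\mathbf{u} + \mathbf{v}] = a\mathbf{u} + a\mathbf{v}$
Distributive: $(a + b)\mathbf{u} = a\mathbf{u} + b\mathbf{u}$
Associative: $a[b\mathbf{u}] = (ab)\mathbf{u}$

e. Multiplication
By unit scalar: $1\mathbf{u} = \mathbf{u}$
By zero: $0\mathbf{u} = \mathbf{0}$

f. Negative vector: $(-1)\mathbf{u} = -\mathbf{u}$.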

2f. The functions in our linear function space satisfy the properties listed for
vectors (substitute "function" for "vector"):

a. $f(x) + g(x) = g(x) + f(x)$

b. $[f(x) + g(x)] + h(x) = f(x) + [g(x) + h(x)]$

c. $0 + f(x) = f(x)$

d. $a[f(x) + g(x)] = af(x) + ag(x)$
$(a + b)f(x) = af(x) + bf(x)$
$a[bf(x)] = (ab)f(x)$

e. $1\cdot f(x) = f(x)$
$0\cdot f(x) = 0$

f. $(-1)\cdot f(x) = -f(x)$.
3v. In $n$-dimensional vector space an arbitrary vector $\mathbf{c}$ is described by its $n$
components $(c_1, c_2, \ldots, c_n)$ or

\[ \mathbf{c} = \sum_{i=1}^{n} c_i\mathbf{e}_i, \qquad c_i = \mathbf{e}_i\cdot\mathbf{c}. \]

When (i) $n$ $\mathbf{e}_i$ are linearly independent and (ii) span the $n$-dimensional
vector space, then the $\mathbf{e}_i$ form a basis and constitute a complete set.

3f. In $n$-dimensional function space a polynomial of degree $m \le n - 1$ is
described by

\[ f(x) = \sum_{i=0}^{n-1} c_i\varphi_i(x), \qquad c_i = \frac{\langle\varphi_i|f\rangle}{\langle\varphi_i|\varphi_i\rangle}. \]

When (i) the $n$ $\varphi_i(x)$ are linearly independent and (ii) span the
$n$-dimensional function space, then the $\varphi_i(x)$ form a basis and constitute
a complete set (for describing polynomials of degree $m \le n - 1$).

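4v. An inner product (scalar, dot product) is defined by

\[ \mathbf{c}\cdot\mathbf{d} = \sum_{i=1}^{n} c_i d_i. \]

If $\mathbf{c}$ and $\mathbf{d}$ have complex components, the inner product is defined as
$\sum_{i=1}^{n} c_i^* d_i$. The inner product has the properties of

a. Distributive law of addition: $\mathbf{c}\cdot(\mathbf{d} + \mathbf{e}) = \mathbf{c}\cdot\mathbf{d} + \mathbf{c}\cdot\mathbf{e}$

b. Scalar multiplication: $\mathbf{c}\cdot a\mathbf{d} = a\,\mathbf{c}\cdot\mathbf{d}$

c. Complex conjugation: $\mathbf{c}\cdot\mathbf{d} = (\mathbf{d}\cdot\mathbf{c})^*$.

4f. An inner product is defined by

\[ \langle f|g\rangle = \int_a^b f^*(x)g(x)w(x)\,dx. \]

The choice of the weighting function $w(x)$ and the interval $(a, b)$
follows from the differential equation satisfied by $\varphi_i(x)$ and the
boundary conditions (Section 9.1). In matrix terminology (Section 3.2), $|g\rangle$ is a
column vector and $\langle f|$ is a row vector, the adjoint of $|f\rangle$.

The inner product has the properties listed for vectors:

a. $\langle f|g + h\rangle = \langle f|g\rangle + \langle f|h\rangle$

b. $\langle f|ag\rangle = a\langle f|g\rangle$

c. $\langle f|g\rangle = \langle g|f\rangle^*$.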

5v. Orthogonality:

\[ \mathbf{e}_i\cdot\mathbf{e}_j = 0, \qquad i \ne j. \]

If the $n$ $\mathbf{e}_i$ are not already orthogonal, the Gram-Schmidt process may be
used to create an orthogonal set.

5f. Orthogonality:

\[ \langle\varphi_i|\varphi_j\rangle = \int_a^b \varphi_i^*(x)\varphi_j(x)w(x)\,dx = 0, \qquad i \ne j. \]

If the $n$ $\varphi_i(x)$ are not already orthogonal, the Gram-Schmidt process
(Section 9.3) may be used to create an orthogonal set.
6v. Definition of norm:

\[ |\mathbf{c}| = (\mathbf{c}\cdot\mathbf{c})^{1/2} = \left(\sum_{i=1}^{n} c_i^2\right)^{1/2}. \]

The basis vectors $\mathbf{e}_i$ are taken to have unit norm (length) $\mathbf{e}_i\cdot\mathbf{e}_i = 1$. The
components of $\mathbf{c}$ are given by

\[ c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, n. \]

6f. Definition of norm:

\[ \|f\| = \langle f|f\rangle^{1/2} = \left[\int_a^b |f(x)|^2 w(x)\,dx\right]^{1/2} = \left[\sum_{i=0}^{n-1}|c_i|^2\right]^{1/2}. \]

Parseval's identity: $\|f\| > 0$ unless $f(x)$ is identically zero. The basis
functions $\varphi_i(x)$ may be taken to have unit norm (unit normalization)

\[ \|\varphi_i\| = 1. \]

Note that Legendre polynomials are not normalized to unity.

The expansion coefficients of our polynomial $f(x)$ are given by

\[ c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, n - 1. \]

7v. Bessel's inequality:

\[ \mathbf{c}\cdot\mathbf{c} \ge \sum_i c_i^2. \]

If the equal sign holds for all $\mathbf{c}$, it indicates that the $\mathbf{e}_i$ span the vector space;
that is, they are complete.

7f. Bessel's inequality:

\[ \langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx \ge \sum_i |c_i|^2. \]

If the equal sign holds for all allowable $f$, it indicates that the $\varphi_i(x)$ span
the function space; that is, they are complete.

8v. Schwarz inequality:

\[ \mathbf{c}\cdot\mathbf{d} \le |\mathbf{c}|\,|\mathbf{d}|. \]

The equal sign holds when $\mathbf{c}$ is a multiple of $\mathbf{d}$. If the angle included between
$\mathbf{c}$ and $\mathbf{d}$ is $\theta$, then $|\cos\theta| \le 1$.

8f. Schwarz inequality:

\[ |\langle f|g\rangle| \le \langle f|f\rangle^{1/2}\langle g|g\rangle^{1/2} = \|f\|\cdot\|g\|. \]

The equals sign holds when $f(x)$ and $g(x)$ are linearly dependent; that is,
when $f(x)$ is a multiple of $g(x)$.

9v. Now, let $n \to \infty$, forming an infinite-dimensional linear vector space, $l^2$.
In an infinite-dimensional space our vector $\mathbf{c}$ is

$$\mathbf{c} = \sum_{i=1}^{\infty} c_i\,\mathbf{e}_i.$$

We require that

$$\sum_{i=1}^{\infty} c_i^2 < \infty.$$

The components of $\mathbf{c}$ are given by

$$c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, \infty,$$

exactly as in a finite-dimensional vector space.

9f. Then let $n \to \infty$, forming an infinite-dimensional vector (function) space,
$L^2$. The superscript 2 stands for the quadratic norm [i.e., the 2 in
$|f(x)|^2$]. Our functions need no longer be polynomials, but we do require
that $f(x)$ be at least piecewise continuous (Dirichlet conditions for Fourier
series) and that $\langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx$ exist. This latter condition is
often stated as a requirement that $f(x)$ be square integrable.

Cauchy sequence: Let

$$f_n(x) = \sum_{i=0}^{n} c_i\,\varphi_i(x).$$

If

$$\|f(x) - f_n(x)\| \to 0 \quad \text{as } n \to \infty$$

or

$$\lim_{n\to\infty}\int_a^b \left| f(x) - \sum_{i=0}^{n} c_i\,\varphi_i(x)\right|^2 w(x)\,dx = 0,$$

then we have convergence in the mean. This is analogous to the partial
sum-Cauchy sequence criterion for the convergence of an infinite series
(Section 5.1).

If every Cauchy sequence of allowable vectors (square integrable, piecewise
continuous functions) converges to a limit vector in our linear space,
the space is said to be complete. Then

$$f(x) = \sum_{i=0}^{\infty} c_i\,\varphi_i(x) \quad \text{(almost everywhere)}$$

in the sense of convergence in the mean. As noted previously, this is a
weaker requirement than pointwise convergence (fixed value of $x$) or uniform
convergence.


Expansion (Fourier) Coefficients

For a function $f$ its Fourier coefficients are defined as

$$c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, \infty,$$

exactly as in a finite-dimensional vector space. Hence,

$$f(x) = \sum_i \langle\varphi_i|f\rangle\,\varphi_i(x).$$
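The function-space items above (orthonormality, norm, expansion coefficients, Bessel’s inequality, and the Schwarz inequality) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes the interval [-1, 1] with weight w(x) = 1, uses normalized Legendre polynomials as the orthonormal basis, restricts itself to real functions (so the complex conjugate is omitted), and evaluates inner products with Simpson’s rule. All function names are hypothetical and are not taken from the text.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def legendre(n, x):
    """Legendre polynomial P_n(x) from (k+1)P_{k+1} = (2k+1)xP_k - kP_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def phi(i):
    """Orthonormal basis function sqrt((2i+1)/2) P_i(x) on [-1, 1], w(x) = 1."""
    return lambda x: math.sqrt((2 * i + 1) / 2) * legendre(i, x)

def inner(f, g):
    """Inner product <f|g> for real functions on [-1, 1] (assumed weight 1)."""
    return simpson(lambda x: f(x) * g(x), -1.0, 1.0)

def norm(f):
    return math.sqrt(inner(f, f))

def coefficients(f, n_terms):
    """Expansion coefficients c_i = <phi_i|f>, i = 0, ..., n_terms - 1."""
    return [inner(phi(i), f) for i in range(n_terms)]

if __name__ == "__main__":
    c = coefficients(math.exp, 4)
    print("sum of c_i^2 :", sum(ci * ci for ci in c))
    print("<f|f>        :", inner(math.exp, math.exp))
```

With only four terms the sum of squared coefficients already lies slightly below ⟨f|f⟩, as Bessel’s inequality requires; adding terms closes the gap, which is the numerical face of completeness.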
SUMMARY


A linear space (finite- or infinite-dimensional) that (i) has an inner product
defined, $\langle f|g\rangle$, and (ii) is complete is a Hilbert space.

Infinite-dimensional Hilbert space provides a natural mathematical framework
for modern quantum mechanics because bound-state wave functions
are normalized (square-integrable) and usually are eigenfunctions of some
Hamiltonian that provides a basis of the Hilbert space. A physical state may
be expanded in a set of basis vectors, which are eigenstates of some observable.
The expansion coefficients squared give the probabilities of the different
eigenvalues of the observable in the given state. Apart from quantum mechanics,
Hilbert space retains its abstract mathematical power and beauty, but the
necessity for its use is reduced.

The Sturm-Liouville theory of second-order ODEs with boundary conditions
leads to eigenvalue problems whose solutions are eigenfunctions with orthogonality
properties. Special functions, such as Legendre polynomials, Bessel
functions, and Laguerre polynomials, arise in this context. Eigenfunction
expansions are important in quantum mechanics and many other areas of physics
and engineering.


Biographical Data 

Hilbert, David. Hilbert, a German mathematician, was born in 1862 in
Königsberg, Prussia (now Russia), and died in 1943 in Göttingen, Germany.
Son of a judge, he obtained his Ph.D. in mathematics at the University of
Königsberg in 1885 and became a professor at Göttingen in 1895. In 1899,
in his Foundations of Geometry he established the first consistent set of
geometric axioms, which helped the axiomatic method for the foundation
of mathematics to gain general recognition. He contributed to most active
branches of mathematics and, with Poincaré, is considered one of the greatest
mathematicians of the 20th century. He solved Waring’s problem in number
theory, developed solutions for integral equations, and is famous for an
influential list of unsolved mathematics problems presented in 1900 at the
International Congress of Mathematicians in Paris, which deeply influenced
the development of mathematics in the 20th century.


EXERCISES 

9.4.1 A function $f(x)$ is expanded in a series of orthonormal eigenfunctions

$$f(x) = \sum_{n=0}^{\infty} a_n\,\varphi_n(x).$$

Show that the series expansion is unique for a given set of $\varphi_n(x)$. The
functions $\varphi_n(x)$ are being taken here as the basis vectors in an
infinite-dimensional Hilbert space.
9.4.2 A function $f(x)$ is represented by a finite set of basis functions $\varphi_i(x)$,

$$f(x) = \sum_{i=1}^{N} c_i\,\varphi_i(x).$$

Show that the components $c_i$ are unique; that no different set $c_i'$ exists.
Note. Your basis functions are automatically linearly independent. They
are not necessarily orthogonal.

9.4.3 A function $f(x)$ is approximated by a power series $\sum_{i=0}^{n-1} c_i x^i$ over the
interval [0, 1]. Show that minimizing the mean square error leads to a
set of linear equations

$$\mathsf{A}\mathbf{c} = \mathbf{b},$$

where

$$A_{ij} = \int_0^1 x^{i+j}\,dx = \frac{1}{i+j+1}, \qquad i, j = 0, 1, 2, \ldots, n-1$$

and

$$b_i = \int_0^1 x^i f(x)\,dx, \qquad i = 0, 1, 2, \ldots, n-1.$$

Note. The $A_{ij}$ are the elements of the Hilbert matrix of order n. The
determinant of this Hilbert matrix is a rapidly decreasing function of
n. For n = 5, det A = 3.7 × 10^{-12} and the set of equations Ac = b is
becoming ill conditioned and unstable.
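The determinant quoted in the note can be reproduced exactly. The sketch below is a minimal illustration, assuming nothing beyond the definition $A_{ij} = 1/(i+j+1)$: it builds the Hilbert matrix with rational arithmetic from Python’s standard `fractions` module and evaluates the determinant by Gaussian elimination. The function names are hypothetical.

```python
from fractions import Fraction

def hilbert(n):
    """Hilbert matrix of order n with exact entries A_ij = 1/(i + j + 1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]          # work on a copy
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a nonzero pivot in this column (row swap flips the sign).
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

if __name__ == "__main__":
    for order in range(1, 7):
        print(order, float(determinant(hilbert(order))))
```

Exact arithmetic is used deliberately: in floating point the elimination itself suffers from the ill conditioning that the note describes.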


9.4.4 In place of the expansion of a function $F(x)$ given by

$$F(x) = \sum_{n=0}^{\infty} a_n\,\varphi_n(x),$$

with

$$a_n = \int_a^b F(x)\,\varphi_n(x)\,w(x)\,dx,$$

take the finite series approximation

$$F(x) \approx \sum_{n=0}^{m} c_n\,\varphi_n(x).$$

Show that the mean square error

$$\int_a^b \left[F(x) - \sum_{n=0}^{m} c_n\,\varphi_n(x)\right]^2 w(x)\,dx$$

is minimized by taking $c_n = a_n$.

Note. The values of the coefficients are independent of the number
of terms in the finite series. This independence is a consequence of
orthogonality and would not hold for a least-squares fit using powers
of x.
9.4.5 From Example 9.2.2,

$$f(x) = \begin{cases} h/2, & 0 < x < \pi \\ -h/2, & -\pi < x < 0 \end{cases} \;=\; \frac{2h}{\pi}\sum_{n=0}^{\infty}\frac{\sin(2n+1)x}{2n+1}.$$

(a) Show that

$$\int_{-\pi}^{\pi}[f(x)]^2\,dx = \frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty}(2n+1)^{-2}.$$

For a finite upper limit, this would be Bessel’s inequality. For the
upper limit, ∞, this is Parseval’s identity.

(b) Verify that

$$\frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty}(2n+1)^{-2}$$

by evaluating the series.

Hint. The series can be expressed as the Riemann zeta function.


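9.4.6 Differentiate Eq. (9.79),

$$\langle\psi|\psi\rangle = \langle f|f\rangle + \lambda\langle f|g\rangle + \lambda^*\langle g|f\rangle + \lambda\lambda^*\langle g|g\rangle,$$

with respect to $\lambda^*$ and show that you get the Schwarz inequality
[Eq. (9.78)].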


9.4.7 Derive the Schwarz inequality from the identity

$$\left[\int_a^b f(x)g(x)\,dx\right]^2 = \int_a^b [f(x)]^2\,dx \int_a^b [g(x)]^2\,dx - \frac{1}{2}\int_a^b\!\!\int_a^b [f(x)g(y) - f(y)g(x)]^2\,dx\,dy.$$


9.4.8 If the functions $f(x)$ and $g(x)$ of the Schwarz inequality [Eq. (9.78)] may
be expanded in a series of eigenfunctions $\varphi_i(x)$, show that Eq. (9.78)
reduces to Eq. (9.76) (with n possibly infinite).

Note the description of $f(x)$ as a vector in a function space in which
$\varphi_i(x)$ corresponds to the unit vector $\mathbf{e}_i$.


9.4.9 The operator H is Hermitian and positive definite; that is,

$$\int_a^b f^* H f\,dx > 0.$$

Prove the generalized Schwarz inequality:

$$\left|\int_a^b f^* H g\,dx\right|^2 \leq \int_a^b f^* H f\,dx \int_a^b g^* H g\,dx.$$


9.4.10 A normalized wave function $\psi(x) = \sum_{n=0}^{\infty} a_n\,\varphi_n(x)$. The expansion
coefficients $a_n$ are known as probability amplitudes. We may define
a density matrix $\rho$ with elements $\rho_{ij} = a_i a_j^*$. Show that

$$(\rho^2)_{ij} = \rho_{ij}$$

or

$$\rho^2 = \rho.$$

This result, by definition, makes $\rho$ a projection operator.
Hint. Use

$$\int \psi^*\psi\,dx = 1.$$

9.4.11 Show that

(a) the operator

$$|\varphi_i\rangle\langle\varphi_i|$$

operating on

$$f(t) = \sum_j c_j\,\langle t|\varphi_j\rangle$$

yields $c_i\,\langle t|\varphi_i\rangle$, where $\langle t|\varphi_j\rangle \equiv \varphi_j(t)$.

This operator is a projection operator projecting $f(x)$ onto the
ith coordinate, selectively picking out the ith component $c_i|\varphi_i\rangle$ of
$f(x)$.

(b) $\sum_i |\varphi_i\rangle\langle\varphi_i| = 1$.

(c) For $\langle x|\varphi_p\rangle = e^{ipx}/\sqrt{2\pi}$ derive from (a) the Fourier integral of the
function $f(t)$ and from (b) the Fourier integral representation of
Dirac’s δ function (see Chapter 1). Note that p is a continuous
variable (momentum) replacing the discrete index i.

Hint. The operator operates via the well-defined inner product. In the
coordinate representation you actually have to work with
$\langle x|\varphi_i\rangle\langle\varphi_i|t\rangle = \varphi_i(x)\,\varphi_i^*(t)$.





The Gamma Function 
(Factorial Function) 


The gamma function appears in physical problems of all kinds, such as the 
normalization of Coulomb wave functions and the computation of probabilities 
in statistical mechanics. Its importance stems from its usefulness in developing 
other functions that have direct physical application. The gamma function, 
therefore, is included here. A discussion of the numerical evaluation of the 
gamma function appears in Section 10.3. Closely related functions, such as the 
error integral, are presented in Section 10.4. 


10.1 Definitions and Simple Properties 


At least three different, convenient definitions of the gamma function are in 
common use. Our first task is to state these definitions, to develop some simple, 
direct consequences, and to show the equivalence of the three forms. 


Infinite Limit (Euler)

The first definition, due to Euler, is

$$\Gamma(z) \equiv \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots n}{z(z+1)(z+2)\cdots(z+n)}\,n^z, \qquad z \neq 0, -1, -2, -3, \ldots. \tag{10.1}$$

This definition of $\Gamma(z)$ is useful in developing the Weierstrass infinite-product
form of $\Gamma(z)$ [Eq. (10.17)] and in obtaining the derivative of $\ln\Gamma(z)$
(Section 10.2). Here and elsewhere in this chapter, z may be either real or complex.
Replacing z with z + 1, we have

$$\Gamma(z+1) = \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots n}{(z+1)(z+2)(z+3)\cdots(z+n+1)}\,n^{z+1}
= \lim_{n\to\infty}\frac{nz}{z+n+1}\cdot\frac{1\cdot 2\cdot 3\cdots n}{z(z+1)(z+2)\cdots(z+n)}\,n^{z}
= z\,\Gamma(z). \tag{10.2}$$

This is the basic functional relation for the gamma function. It should be noted
that it is a difference equation. The gamma function is one of a general class
of functions that do not satisfy any differential equation with rational
coefficients. Specifically, the gamma function is one of the very few functions of
mathematical physics that does not satisfy any of the ordinary differential
equations (ODEs) common to physics. In fact, it does not satisfy any useful or
practical differential equation.

Also, from the definition

$$\Gamma(1) = \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots n}{1\cdot 2\cdot 3\cdots n\,(n+1)}\,n = 1. \tag{10.3}$$

Now, application of Eq. (10.2) gives

$$\Gamma(2) = 1, \qquad \Gamma(3) = 2\,\Gamma(2) = 2, \;\ldots, \qquad \Gamma(n) = 1\cdot 2\cdot 3\cdots(n-1) = (n-1)!. \tag{10.4}$$

We see that the gamma function interpolates the factorials by a continuous
function that returns the factorials at integer arguments.
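The Euler limit of Eq. (10.1) can be evaluated at large but finite n to see both the factorial values and the functional relation (10.2) emerge. The sketch below is an illustration only: it is restricted to real z > 0, truncates the limit at an assumed n = 100000, and sums logarithms to avoid overflow. The function name is hypothetical; `math.gamma` from the Python standard library serves as the reference value.

```python
import math

def gamma_euler_limit(z, n=100000):
    """Approximate Gamma(z), real z > 0, by Eq. (10.1) truncated at finite n."""
    # log of n^z / z * prod_{k=1}^{n} k / (z + k)
    log_value = z * math.log(n) - math.log(z)
    for k in range(1, n + 1):
        log_value += math.log(k) - math.log(z + k)
    return math.exp(log_value)

if __name__ == "__main__":
    for z in (0.5, 1.0, 2.5, 5.0):
        print(z, gamma_euler_limit(z), math.gamma(z))
```

The relative error of the truncated product falls off only like 1/n, so this form is useful for derivations rather than for computation.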


Definite Integral (Euler) 


A second definition, also frequently called the Euler integral, is

$$\Gamma(z) \equiv \int_0^\infty e^{-t}\,t^{z-1}\,dt, \qquad \Re(z) > 0. \tag{10.5}$$

The restriction on z is necessary to avoid divergence of the integral at t = 0.
When the gamma function does appear in physical problems, it is often in this
form or some variation, such as

$$\Gamma(z) = 2\int_0^\infty e^{-t^2}\,t^{2z-1}\,dt, \qquad \Re(z) > 0 \tag{10.6}$$

or

$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0.$$
EXAMPLE 10.1.1


The Euler Integral Interpolates the Factorials The Euler integral for
positive integer argument, z = n + 1 with n > 0, yields the factorials. Using
integration by parts repeatedly we find

$$\int_0^\infty e^{-t}\,t^{n}\,dt = -t^{n}e^{-t}\Big|_0^\infty + n\int_0^\infty e^{-t}\,t^{n-1}\,dt = n\int_0^\infty e^{-t}\,t^{n-1}\,dt$$

$$= n\left[-t^{n-1}e^{-t}\Big|_0^\infty + (n-1)\int_0^\infty e^{-t}\,t^{n-2}\,dt\right] = n(n-1)\int_0^\infty e^{-t}\,t^{n-2}\,dt = \cdots = n!$$

because $\int_0^\infty e^{-t}\,dt = 1$. Thus, the Euler integral also interpolates the factorials.


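When $z = \frac{1}{2}$, Eq. (10.6) contains the Gauss error integral, and we have the
interesting result

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}. \tag{10.7}$$

The value $\Gamma(\frac{1}{2})$ can be derived directly from the square of Eq. (10.6) for $z = \frac{1}{2}$
by introducing plane polar coordinates ($x^2 + y^2 = \rho^2$, $dx\,dy = \rho\,d\rho\,d\varphi$) in the
product of integrals

$$\Gamma\left(\tfrac{1}{2}\right)^2 = 4\int_0^\infty\!\!\int_0^\infty e^{-x^2-y^2}\,dx\,dy
= 4\int_{\varphi=0}^{\pi/2}\int_{\rho=0}^{\infty} e^{-\rho^2}\rho\,d\rho\,d\varphi = -\pi\,e^{-\rho^2}\Big|_0^\infty = \pi.$$

Generalizations of Eq. (10.7), the Gaussian integrals, are

$$\int_0^\infty x^{2s+1}\exp(-ax^2)\,dx = \frac{s!}{2a^{s+1}}, \tag{10.8}$$

$$\int_0^\infty x^{2s}\exp(-ax^2)\,dx = \frac{(s-\frac{1}{2})!}{2a^{s+1/2}} = \frac{(2s-1)!!}{2^{s+1}a^{s}}\sqrt{\frac{\pi}{a}}, \tag{10.9}$$

which are of major importance in statistical mechanics. A proof is left to the
reader in Exercise 10.1.11. The double factorial notation is explained in
Exercise 5.2.12 and Eq. (10.33b).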
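The two Gaussian integrals can be checked numerically before attempting the proof. The following Python sketch is a hedged illustration: it evaluates the closed forms of Eqs. (10.8) and (10.9), using $(s-\frac{1}{2})! = \Gamma(s+\frac{1}{2})$, and compares them with a Simpson-rule quadrature on a truncated interval. The cutoff `upper=12.0` is an assumption that is adequate only when a is of order unity; all function names are hypothetical.

```python
import math

def double_factorial(n):
    """n!! for integer n >= -1, with the convention 0!! = (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def gaussian_odd(s, a):
    """Right side of Eq. (10.8): s! / (2 a^(s+1))."""
    return math.factorial(s) / (2 * a ** (s + 1))

def gaussian_even(s, a):
    """Middle form of Eq. (10.9), with (s - 1/2)! = Gamma(s + 1/2)."""
    return math.gamma(s + 0.5) / (2 * a ** (s + 0.5))

def gaussian_even_double_factorial(s, a):
    """Last form of Eq. (10.9), written with the double factorial."""
    return double_factorial(2 * s - 1) / (2 ** (s + 1) * a ** s) * math.sqrt(math.pi / a)

def gaussian_moment_numeric(power, a, upper=12.0, n=20000):
    """Simpson estimate of the integral of x^power exp(-a x^2) over [0, upper]."""
    h = upper / n
    f = lambda x: x ** power * math.exp(-a * x * x)
    total = f(0.0) + f(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

if __name__ == "__main__":
    s, a = 2, 1.5
    print(gaussian_odd(s, a), gaussian_moment_numeric(2 * s + 1, a))
    print(gaussian_even(s, a), gaussian_moment_numeric(2 * s, a))
```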

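To show the equivalence of the two definitions, Eqs. (10.1) and (10.5),
consider the function of two variables

$$F(z, n) = \int_0^n \left(1 - \frac{t}{n}\right)^n t^{z-1}\,dt, \qquad \Re(z) > 0, \tag{10.10}$$

with n a positive integer. Since

$$\lim_{n\to\infty}\left(1 - \frac{t}{n}\right)^n \equiv e^{-t} \tag{10.11}$$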
from the definition of the exponential

$$\lim_{n\to\infty} F(z, n) = F(z, \infty) = \int_0^\infty e^{-t}\,t^{z-1}\,dt \equiv \Gamma(z) \tag{10.12}$$

by Eq. (10.5).

Returning to $F(z, n)$, we evaluate it in successive integrations by parts. For
convenience let u = t/n. Then

$$F(z, n) = n^z\int_0^1 (1-u)^n\,u^{z-1}\,du. \tag{10.13}$$

Integrating by parts, we obtain for $\Re(z) > 0$,

$$\frac{F(z, n)}{n^z} = (1-u)^n\,\frac{u^z}{z}\bigg|_0^1 + \frac{n}{z}\int_0^1 (1-u)^{n-1}\,u^z\,du. \tag{10.14}$$


Repeating this, with the integrated part vanishing at both end points each
time, we finally get

$$F(z, n) = n^z\,\frac{n(n-1)\cdots 1}{z(z+1)\cdots(z+n-1)}\int_0^1 u^{z+n-1}\,du
= \frac{1\cdot 2\cdot 3\cdots n}{z(z+1)(z+2)\cdots(z+n)}\,n^z. \tag{10.15}$$

This is identical to the expression on the right side of Eq. (10.1). Hence,

$$\lim_{n\to\infty} F(z, n) = F(z, \infty) \equiv \Gamma(z) \tag{10.16}$$

by Eq. (10.1), completing the proof.

Using the functional equation (10.2) we can extend $\Gamma(z)$ from positive to
negative arguments. For example, starting with Eq. (10.2) we define

$$\Gamma\left(-\tfrac{1}{2}\right) = -2\,\Gamma\left(\tfrac{1}{2}\right) = -2\sqrt{\pi},$$

and Eq. (10.2) for $z \to 0$ implies that $\Gamma(z \to 0) \to \infty$. Moreover, $z\Gamma(z) \to 1$ for
$z \to 0$ [Eq. (10.2)] shows that $\Gamma(z)$ has a simple pole at the origin. Similarly,
we find simple poles of $\Gamma(z)$ at all negative integers.
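The extension to negative arguments amounts to applying $\Gamma(z) = \Gamma(z+1)/z$ repeatedly until the argument is positive. The sketch below illustrates this for real, nonintegral z; the function name is hypothetical, and `math.gamma` (which itself accepts negative nonintegers) is used only for positive arguments and as an independent check.

```python
import math

def gamma_extended(z):
    """Gamma(z) for real z, extended to z < 0 through Gamma(z) = Gamma(z + 1) / z."""
    if z > 0:
        return math.gamma(z)
    if z == int(z):
        # z = 0, -1, -2, ... are the simple poles of Gamma(z).
        raise ValueError("Gamma(z) has a pole at z = %r" % z)
    return gamma_extended(z + 1) / z

if __name__ == "__main__":
    print(gamma_extended(-0.5), -2 * math.sqrt(math.pi))
    # z * Gamma(z) -> 1 as z -> 0, the signature of a simple pole with residue 1.
    for z in (1e-2, 1e-4, 1e-6):
        print(z, z * math.gamma(z))
```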

Infinite Product (Weierstrass) 

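The third definition (Weierstrass’s form) is

$$\frac{1}{\Gamma(z)} \equiv z\,e^{\gamma z}\prod_{n=1}^{\infty}\left(1 + \frac{z}{n}\right)e^{-z/n}, \tag{10.17}$$

where $\gamma$ is the Euler-Mascheroni constant [Eq. (5.27)]

$$\gamma = \lim_{n\to\infty}\left(\sum_{m=1}^{n}\frac{1}{m} - \ln n\right) = 0.5772156\cdots. \tag{10.18}$$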
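This infinite product form may be used to develop the reflection identity,
Eq. (10.24a), and applied in the exercises, such as Exercise 10.1.17. This form
can be derived from the original definition [Eq. (10.1)] by rewriting it as

$$\Gamma(z) = \lim_{n\to\infty}\frac{1\cdot 2\cdot 3\cdots n}{z(z+1)\cdots(z+n)}\,n^z
= \lim_{n\to\infty}\frac{1}{z}\prod_{m=1}^{n}\left(1 + \frac{z}{m}\right)^{-1} n^z. \tag{10.19}$$

Inverting Eq. (10.19) and using

$$n^{-z} = e^{(-\ln n)z}, \tag{10.20}$$

we obtain

$$\frac{1}{\Gamma(z)} = z\lim_{n\to\infty} e^{(-\ln n)z}\prod_{m=1}^{n}\left(1 + \frac{z}{m}\right). \tag{10.21}$$

Multiplying and dividing by

$$\exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)z\right] = \prod_{m=1}^{n} e^{z/m}, \tag{10.22}$$

we get

$$\frac{1}{\Gamma(z)} = z\left\{\lim_{n\to\infty}\exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n\right)z\right]\right\}
\times\left[\lim_{n\to\infty}\prod_{m=1}^{n}\left(1 + \frac{z}{m}\right)e^{-z/m}\right]. \tag{10.23}$$

As shown in Section 5.2, the infinite series in the exponent converges and
defines $\gamma$, the Euler-Mascheroni constant. Hence, Eq. (10.17) follows.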
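A truncated version of the Weierstrass product gives a quick numerical check of Eq. (10.17). The sketch below is illustrative only: it assumes real z > -1 (so that every logarithm is defined), hard-codes a double-precision value of the Euler-Mascheroni constant, and truncates the product at an assumed 200000 factors. The names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant to double precision

def inverse_gamma_weierstrass(z, n_terms=200000):
    """1/Gamma(z) from Eq. (10.17), product truncated after n_terms factors."""
    log_product = 0.0
    for n in range(1, n_terms + 1):
        # log of (1 + z/n) * exp(-z/n); log1p keeps accuracy when z/n is small
        log_product += math.log1p(z / n) - z / n
    return z * math.exp(EULER_GAMMA * z + log_product)

if __name__ == "__main__":
    for z in (0.5, 1.0, 2.5):
        print(z, inverse_gamma_weierstrass(z), 1.0 / math.gamma(z))
```

The neglected factors contribute a relative error of roughly z²/(2N) for N retained factors, so convergence is slow, as it is for the Euler limit.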

The Weierstrass infinite product definition of $\Gamma(z)$ leads directly to an
important identity,

$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin z\pi}, \tag{10.24a}$$

using the infinite product formulas for $\Gamma(z)$, $\Gamma(1-z)$, and $\sin z$ [Eq. (7.60)].
Alternatively, we can start from the product of Euler integrals

$$\Gamma(z+1)\,\Gamma(1-z) = \int_0^\infty s^{z}e^{-s}\,ds\int_0^\infty t^{-z}e^{-t}\,dt
= \int_0^\infty \frac{v^{z}\,dv}{(v+1)^2}\int_0^\infty e^{-u}u\,du = \frac{\pi z}{\sin\pi z},$$

transforming from the variables s, t to u = s + t, v = s/t, as suggested by
combining the exponentials and the powers in the integrands. The Jacobian is

$$J = -\begin{vmatrix} 1 & 1 \\ \dfrac{1}{t} & -\dfrac{s}{t^2} \end{vmatrix} = \frac{s+t}{t^2} = \frac{(v+1)^2}{u},$$

where $(v+1)t = u$. The integral $\int_0^\infty e^{-u}u\,du = 1$, whereas that over v may
be derived by contour integration, giving $\pi z/\sin\pi z$. Similarly, one can establish
Legendre’s duplication formula

$$\Gamma(1+z)\,\Gamma\left(z+\tfrac{1}{2}\right) = 2^{-2z}\sqrt{\pi}\,\Gamma(2z+1).$$

Setting $z = \frac{1}{2}$ in Eq. (10.24a), we obtain

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}, \tag{10.24b}$$

(taking the positive square root) in agreement with Eqs. (10.7) and (10.9).
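Both identities are easy to spot-check numerically. The sketch below simply evaluates the two sides of the reflection formula (10.24a) and of Legendre’s duplication formula with `math.gamma` from the Python standard library for a few real, nonintegral arguments; the helper names are hypothetical.

```python
import math

def reflection_lhs(z):
    return math.gamma(z) * math.gamma(1 - z)

def reflection_rhs(z):
    return math.pi / math.sin(math.pi * z)

def duplication_lhs(z):
    return math.gamma(1 + z) * math.gamma(z + 0.5)

def duplication_rhs(z):
    return 2 ** (-2 * z) * math.sqrt(math.pi) * math.gamma(2 * z + 1)

if __name__ == "__main__":
    for z in (0.3, 1.7, -0.4, 2.25):
        print(z, reflection_lhs(z), reflection_rhs(z),
              duplication_lhs(z), duplication_rhs(z))
```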

The Weierstrass definition shows immediately that $\Gamma(z)$ has simple poles
at z = 0, -1, -2, -3, ..., and that $[\Gamma(z)]^{-1}$ has no poles in the finite complex
plane, which means that $\Gamma(z)$ has no zeros. This behavior may also be seen in
Eq. (10.24a), in which we note that $\pi/(\sin\pi z)$ is never equal to zero.

Actually, the infinite product definition of $\Gamma(z)$ may be derived from the
Weierstrass factorization theorem with the specification that $[\Gamma(z)]^{-1}$ have
simple zeros at z = 0, -1, -2, -3, .... The Euler-Mascheroni constant is fixed
by requiring $\Gamma(1) = 1$. See also the product expansions of entire functions in
Chapter 7.

In mathematical probability theory the gamma distribution (probability
density) is given by

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta}, & x > 0 \\[1ex] 0, & x \leq 0. \end{cases} \tag{10.25a}$$

The constant $[\beta^{\alpha}\Gamma(\alpha)]^{-1}$ is included so that the total (integrated) probability
will be unity. For $x \to E$, kinetic energy, $\alpha \to \frac{3}{2}$ and $\beta \to kT$, Eq. (10.25a)
yields the classical Maxwell-Boltzmann statistics.
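The role of the normalization constant can be confirmed by direct quadrature. The sketch below is an illustration under stated assumptions: it uses the Maxwell-Boltzmann shape α = 3/2 with an arbitrary β = 2, integrates by Simpson’s rule out to an assumed cutoff of 60β, and checks the standard results mean = αβ and variance = αβ² for the gamma distribution. Function names are hypothetical.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma distribution density of Eq. (10.25a)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def gamma_moment(k, alpha, beta, upper_factor=60.0, n=100000):
    """Simpson estimate of the k-th moment, integrating x^k f(x) over [0, upper_factor * beta]."""
    upper = upper_factor * beta
    h = upper / n
    f = lambda x: x ** k * gamma_pdf(x, alpha, beta)
    total = f(0.0) + f(upper)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3

if __name__ == "__main__":
    alpha, beta = 1.5, 2.0
    norm = gamma_moment(0, alpha, beta)
    mean = gamma_moment(1, alpha, beta)
    variance = gamma_moment(2, alpha, beta) - mean ** 2
    print(norm, mean, variance)
```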


Biographical Data 

Weierstrass, Karl Theodor Wilhelm. Weierstrass, a German mathematician,
was born in 1815 in Ostenfelde, Germany, and died in 1897 in Berlin.
He obtained a degree in mathematics in 1841 and studied Abel’s and Jacobi’s
work on elliptical functions and extended their work on analytic functions
while living as a schoolteacher. After being recognized as the father of modern
analysis, he became a professor at the University of Berlin and a member
of the Academy of Sciences in 1856.


Factorial Notation 


So far, this discussion has been presented in terms of the classical notation.
As pointed out by Jeffreys and others, the -1 of the z - 1 exponent in our
second definition [Eq. (10.5)] is a continual nuisance. Accordingly, Eq. (10.5)
is rewritten as

$$\int_0^\infty e^{-t}\,t^{z}\,dt \equiv z!, \qquad \Re(z) > -1, \tag{10.25b}$$

to define a factorial function z!. Occasionally, we may even encounter Gauss’s
notation, $\Pi(z)$, for the factorial function

$$\Pi(z) = z!. \tag{10.26}$$

The $\Gamma$ notation is due to Legendre. The factorial function of Eq. (10.25b) is,
of course, related to the gamma function by

$$\Gamma(z) = (z-1)!, \quad \text{or} \quad \Gamma(z+1) = z!. \tag{10.27}$$

If z = n, a positive integer, Eq. (10.4) shows that

$$z! = n! = 1\cdot 2\cdot 3\cdots n, \tag{10.28}$$

the familiar factorial. However, it should be noted that since z! is now defined
by Eq. (10.25b) [or equivalently by Eq. (10.27)] the factorial function is no
longer limited to positive integral values of the argument (Fig. 10.1).
The difference relation [Eq. (10.2)] becomes

$$(z-1)! = \frac{z!}{z}. \tag{10.29}$$

This shows immediately that

$$0! = 1 \tag{10.30}$$

and

$$n! = \pm\infty \quad \text{for } n \text{, a negative integer.} \tag{10.31}$$

[Figure 10.1: The Factorial Function: Extension to Negative Arguments (x! plotted against x)]

[Figure 10.2: The Factorial Function and the First Two Derivatives of ln(x!)]

In terms of the factorial, Eq. (10.24a) becomes

$$z!\,(-z)! = \frac{\pi z}{\sin\pi z} \tag{10.32}$$

because

$$\Gamma(z)\,\Gamma(1-z) = (z-1)!\,(1-z-1)! = (z-1)!\,(-z)! = \frac{1}{z}\,z!\,(-z)!.$$

By restricting ourselves to the real values of the argument, we find that x!
defines the curve shown in Fig. 10.2. The minimum of the curve in Fig. 10.1 is

$$x! = (0.46163\cdots)! = 0.88560\cdots. \tag{10.33a}$$


Double Factorial Notation 


In many problems of mathematical physics, particularly in connection with
Legendre polynomials (Chapter 11), we encounter products of the odd positive
integers and products of the even positive integers. For convenience, these are
given special labels as double factorials:

$$1\cdot 3\cdot 5\cdots(2m+1) = (2m+1)!!$$
$$2\cdot 4\cdot 6\cdots(2m) = (2m)!!. \tag{10.33b}$$

Clearly, these are related to the regular factorial functions by

$$(2m)!! = 2^{m}\,m! \quad \text{and} \quad (2m+1)!! = \frac{(2m+1)!}{2^{m}\,m!}. \tag{10.33c}$$
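The relations of Eq. (10.33c) can be verified for small m by direct multiplication. The sketch below uses exact integer arithmetic; the function name is hypothetical, and the convention 0!! = (-1)!! = 1 is the one stated in the answer to Exercise 10.1.12.

```python
import math

def double_factorial(n):
    """n!! for integer n >= -1, with the convention 0!! = (-1)!! = 1."""
    if n < -1:
        raise ValueError("this sketch handles n >= -1 only")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

if __name__ == "__main__":
    for m in range(6):
        even = double_factorial(2 * m)
        odd = double_factorial(2 * m + 1)
        print(m, even, 2 ** m * math.factorial(m),
              odd, math.factorial(2 * m + 1) // (2 ** m * math.factorial(m)))
```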
Integral Representation


An integral representation that is useful in developing asymptotic series for
the Bessel functions is

$$\int_C e^{-z}z^{\nu}\,dz = (e^{2\pi i\nu} - 1)\,\nu!, \tag{10.34}$$

where C is the contour shown in Fig. 10.3. This contour integral representation
is only useful when $\nu$ is not an integer, z = 0 then being a branch point.
Equation (10.34) may be verified for $\nu > -1$ by deforming the contour as
shown in Fig. 10.4. The integral from ∞ to ε > 0 in Fig. 10.4 yields $-(\nu!)$,
placing the phase of z at 0. The integral from ε to ∞ (in the fourth quadrant)
then yields $e^{2\pi i\nu}\nu!$, the phase of z having increased to 2π. Since the circle
of radius ε around the origin contributes nothing as ε → 0, when $\nu > -1$,
Eq. (10.34) follows.

It is often convenient to put this result into a more symmetrical form

$$\int_C e^{-z}(-z)^{\nu}\,dz = 2i\sin(\nu\pi)\,\nu!, \tag{10.35}$$

multiplying both sides of Eq. (10.34) by $(-1)^{\nu} = e^{-\pi i\nu}$.

This analysis establishes Eqs. (10.34) and (10.35) for $\nu > -1$. It is relatively
simple to extend the range to include all nonintegral $\nu$. First, we note that the
integral exists for $\nu < -1$, as long as we stay away from the origin. Second,
integrating by parts we find that Eq. (10.35) yields the familiar difference
relation [Eq. (10.29)]. If we take the difference relation to define the factorial
function of $\nu < -1$, then Eqs. (10.34) and (10.35) are verified for all $\nu$ (except
negative integers).

EXERCISES 

10.1.1 Derive the recurrence relations

$$\Gamma(z+1) = z\,\Gamma(z)$$

from the Euler integral [Eq. (10.5)]

$$\Gamma(z) = \int_0^\infty e^{-t}\,t^{z-1}\,dt.$$

10.1.2 In a power series solution for the Legendre functions of the second
kind, we encounter the expression

$$\frac{(n+1)(n+2)(n+3)\cdots(n+2s-1)(n+2s)}{2\cdot 4\cdot 6\cdot 8\cdots(2s-2)(2s)\cdot(2n+3)(2n+5)(2n+7)\cdots(2n+2s+1)},$$

in which s is a positive integer. Rewrite this expression in terms of
factorials.

10.1.3 Show that

$$\frac{(s-n)!}{(2s-2n)!} = \frac{(-1)^{n-s}\,(2n-2s)!}{(n-s)!},$$

where s and n are integers with s < n. This result can be used to avoid
negative factorials such as in the series representations of the spherical
Neumann functions and the Legendre functions of the second
kind.

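10.1.4 Show that $\Gamma(z)$ may be written

$$\Gamma(z) = 2\int_0^\infty e^{-t^2}\,t^{2z-1}\,dt, \qquad \Re(z) > 0,$$

$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0.$$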

10.1.5 In a Maxwellian distribution the fraction of particles with speed 
between v and v + dv is 



where N is the total number of particles. The average or expectation
value of $v^n$ is defined as $\langle v^n\rangle = N^{-1}\int v^n\,dN$. Show that
10.1.6 By transforming the integral into a gamma function, show that

$$-\int_0^1 x^{k}\ln x\,dx = \frac{1}{(k+1)^2}, \qquad k > -1.$$

10.1.7 Show that 

10.1.8 Show that

$$\lim_{x\to 0}\frac{(ax-1)!}{(x-1)!} = \frac{1}{a}.$$

10.1.9 Locate the poles of $\Gamma(z)$. Show that they are simple poles and
determine the residues.

10.1.10 Show that the equation x! = k, k ≠ 0, has an infinite number of real
roots.


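10.1.11 Show that

(a) $\displaystyle\int_0^\infty x^{2s+1}\exp(-ax^2)\,dx = \frac{s!}{2a^{s+1}}.$

(b) $\displaystyle\int_0^\infty x^{2s}\exp(-ax^2)\,dx = \frac{(s-\frac{1}{2})!}{2a^{s+1/2}} = \frac{(2s-1)!!}{2^{s+1}a^{s}}\sqrt{\frac{\pi}{a}}.$

These Gaussian integrals are of major importance in statistical
mechanics.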


10.1.12 (a) Develop recurrence relations for (2n)!! and for (2n + 1)!!.

(b) Use these recurrence relations to calculate (or define) 0!! and
(-1)!!.

ANS. 0!! = 1, (-1)!! = 1.


10.1.13 For s a nonnegative integer, show that

$$(-2s-1)!! = \frac{(-1)^{s}}{(2s-1)!!} = \frac{(-1)^{s}\,2^{s}\,s!}{(2s)!}.$$


10.1.14 Express the coefficient of the nth term of the expansion of $(1+x)^{1/2}$

(a) in terms of factorials of integers; and

(b) in terms of the double factorial (!!) functions.

ANS. $a_n = (-1)^{n+1}\dfrac{(2n-3)!}{2^{2n-2}\,n!\,(n-2)!} = (-1)^{n+1}\dfrac{(2n-3)!!}{(2n)!!}, \quad n = 2, 3, \ldots.$


10.1.15 Express the coefficient of the nth term of the expansion of $(1+x)^{-1/2}$

(a) in terms of the factorials of integers; and

(b) in terms of the double factorial (!!) functions.

ANS. $a_n = (-1)^{n}\dfrac{(2n)!}{2^{2n}\,(n!)^2} = (-1)^{n}\dfrac{(2n-1)!!}{(2n)!!}, \quad n = 1, 2, 3, \ldots.$
10.1.16 The Legendre polynomial may be written as

$$P_n(\cos\theta) = 2\,\frac{(2n-1)!!}{(2n)!!}\left\{\cos n\theta + \frac{1}{1}\cdot\frac{n}{2n-1}\cos(n-2)\theta
+ \frac{1\cdot 3}{1\cdot 2}\,\frac{n(n-1)}{(2n-1)(2n-3)}\cos(n-4)\theta\right.$$
$$\left. + \frac{1\cdot 3\cdot 5}{1\cdot 2\cdot 3}\,\frac{n(n-1)(n-2)}{(2n-1)(2n-3)(2n-5)}\cos(n-6)\theta + \cdots\right\}.$$

Let n = 2s + 1. Then

$$P_n(\cos\theta) = P_{2s+1}(\cos\theta) = \sum_{m=0}^{s} a_m\cos(2m+1)\theta.$$

Find $a_m$ in terms of factorials and double factorials.

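10.1.17 (a) Show that

$$\Gamma\left(\tfrac{1}{2} - n\right)\Gamma\left(\tfrac{1}{2} + n\right) = (-1)^{n}\pi,$$

where n is an integer.

(b) Express $\Gamma(\frac{1}{2}+n)$ and $\Gamma(\frac{1}{2}-n)$ separately in terms of $\pi^{1/2}$ and a
!! function.

ANS. $\Gamma\left(\tfrac{1}{2}+n\right) = \dfrac{(2n-1)!!}{2^{n}}\,\pi^{1/2}.$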


10.1.18 From one of the definitions of the factorial or gamma function, show
that

$$|(ix)!|^2 = \frac{\pi x}{\sinh\pi x}.$$


10.1.19 Prove that

$$|\Gamma(\alpha + i\beta)| = |\Gamma(\alpha)|\prod_{n=0}^{\infty}\left[1 + \frac{\beta^2}{(\alpha+n)^2}\right]^{-1/2}.$$

This equation has been useful in calculations of beta decay theory.
10.1.20 Show that

$$|(n+ib)!| = \left(\frac{\pi b}{\sinh\pi b}\right)^{1/2}\prod_{s=1}^{n}(s^2+b^2)^{1/2}$$

for n, a positive integer.

10.1.21 Show that

$$|x!| \geq |(x+iy)!|$$

for all x. The variables x and y are real.

10.1.22 Show that

$$\left|\Gamma\left(\tfrac{1}{2}+iy\right)\right|^2 = \frac{\pi}{\cosh\pi y}.$$
10.1.23 The probability density associated with the normal distribution of
statistics is given by

$$f(x) = \frac{1}{\sigma(2\pi)^{1/2}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],$$

with (-∞, ∞) for the range of x. Show that

(a) the mean value of x, $\langle x\rangle$, is equal to $\mu$; and

(b) the standard deviation $(\langle x^2\rangle - \langle x\rangle^2)^{1/2}$ is given by $\sigma$.


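10.1.24 From the gamma distribution

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta}, & x > 0 \\[1ex] 0, & x \leq 0, \end{cases}$$

show that

(a) $\langle x\rangle$ (mean) $= \alpha\beta$, (b) $\sigma^2$ (variance) $\equiv \langle x^2\rangle - \langle x\rangle^2 = \alpha\beta^2$.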


10.1.25 The wave function of a particle scattered by a Coulomb potential is
$\psi(r, \theta)$. At the origin the wave function becomes

$$\psi(0) = e^{-\pi\gamma/2}\,\Gamma(1 + i\gamma),$$

where $\gamma = Z_1 Z_2 e^2/\hbar v$. Show that

$$|\psi(0)|^2 = \frac{2\pi\gamma}{e^{2\pi\gamma} - 1}.$$


10.1.26 Derive the contour integral representation of Eq. (10.34)

$$2i\,\nu!\sin\nu\pi = \int_C e^{-z}(-z)^{\nu}\,dz.$$


10.2 Digamma and Polygamma Functions 


Digamma Function 


As may be noted from the definitions in Section 10.1, it is inconvenient to deal
with the derivatives of the gamma or factorial function directly. Instead, it is
customary to take the natural logarithm of the factorial function [Eq. (10.1)],
convert the product to a sum, and then differentiate; that is, we start from

$$z! = z\,\Gamma(z) = \lim_{n\to\infty}\frac{n!}{(z+1)(z+2)\cdots(z+n)}\,n^z. \tag{10.36}$$

Then, because the logarithm of the limit is equal to the limit of the logarithm,
we have

$$\ln(z!) = \lim_{n\to\infty}\left[\ln(n!) + z\ln n - \ln(z+1) - \ln(z+2) - \cdots - \ln(z+n)\right]. \tag{10.37}$$
Differentiating with respect to z, we obtain and define

$$\frac{d}{dz}\ln(z!) \equiv \psi(z+1) = \lim_{n\to\infty}\left(\ln n - \frac{1}{z+1} - \frac{1}{z+2} - \cdots - \frac{1}{z+n}\right), \tag{10.38}$$

which defines $\psi(z+1)$, the digamma function. From the definition of the
Euler-Mascheroni constant,¹ Eq. (10.38) may be rewritten as

$$\psi(z+1) = -\gamma - \sum_{n=1}^{\infty}\left(\frac{1}{z+n} - \frac{1}{n}\right) = -\gamma + \sum_{n=1}^{\infty}\frac{z}{n(n+z)}. \tag{10.39}$$

Clearly,

$$\psi(1) = -\gamma = -0.577\,215\,664\,901\cdots.^2 \tag{10.40}$$

Another, even more useful, expression for $\psi(z)$ is derived in Section 10.3.
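The series of Eq. (10.39) can be summed directly to get a feeling for the digamma function. The sketch below is illustrative only: it assumes real z > -1, hard-codes the Euler-Mascheroni constant, and truncates the sum at an assumed 100000 terms, which leaves an error of roughly z/N. As a cross-check it differentiates `math.lgamma` numerically, since $\psi(z+1) = d\ln\Gamma(z+1)/dz$. The names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant to double precision

def digamma_shifted(z, n_terms=100000):
    """psi(z + 1) from the series of Eq. (10.39), truncated after n_terms terms."""
    return -EULER_GAMMA + sum(z / (n * (n + z)) for n in range(1, n_terms + 1))

def digamma_shifted_numeric(z, h=1e-5):
    """Central-difference derivative of ln Gamma(z + 1), used only as a check."""
    return (math.lgamma(z + 1 + h) - math.lgamma(z + 1 - h)) / (2 * h)

if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 3.0):
        print(z, digamma_shifted(z), digamma_shifted_numeric(z))
```

For integer z = x the series telescopes to the harmonic sum 1 + 1/2 + ... + 1/x minus γ, which is the content of Exercise 10.2.1.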


Polygamma Function

The digamma function may be differentiated repeatedly, giving rise to the
polygamma function:

$$\psi^{(m)}(z+1) \equiv \frac{d^{m+1}}{dz^{m+1}}\ln(z!) = (-1)^{m+1}\,m!\sum_{n=1}^{\infty}\frac{1}{(z+n)^{m+1}}, \qquad m = 1, 2, 3, \ldots. \tag{10.41}$$

A plot of $\psi(x+1)$ and $\psi'(x+1)$ is included in Fig. 10.2. Since the series in
Eq. (10.41) defines the Riemann zeta function,³ when z is set to zero,

$$\zeta(m) \equiv \sum_{n=1}^{\infty}\frac{1}{n^{m}}, \tag{10.42}$$

we have

$$\psi^{(m)}(1) = (-1)^{m+1}\,m!\,\zeta(m+1), \qquad m = 1, 2, 3, \ldots. \tag{10.43}$$

The values of the polygamma functions of positive integral argument, $\psi^{(m)}(n+1)$,
may be calculated using Exercise 10.2.6.
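Equations (10.41) and (10.43) can be checked by summing the series numerically. In the sketch below the first `n_direct` terms are added explicitly and the remainder is estimated by a midpoint-rule integral; that tail estimate is an added convenience, not part of the text’s development, and the function name is hypothetical. The test values ζ(2) = π²/6 and ζ(3) = 1.2020569... are standard.

```python
import math

def polygamma_shifted(m, z, n_direct=1000):
    """psi^(m)(z + 1) for m >= 1 from Eq. (10.41), with a midpoint-rule tail estimate."""
    head = sum((z + n) ** -(m + 1) for n in range(1, n_direct + 1))
    # Tail: integral of (z + t)^-(m+1) from n_direct + 1/2 to infinity.
    tail = (z + n_direct + 0.5) ** -m / m
    return (-1) ** (m + 1) * math.factorial(m) * (head + tail)

if __name__ == "__main__":
    print(polygamma_shifted(1, 0.0), math.pi ** 2 / 6)   # psi'(1) = zeta(2)
    print(polygamma_shifted(2, 0.0))                      # psi''(1) = -2 zeta(3)
    # Difference relation of Exercise 10.2.6 with m = 1, z = 0:
    print(polygamma_shifted(1, 1.0) - polygamma_shifted(1, 0.0))
```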

In terms of the more common $\Gamma$ notation,

$$\frac{d^{n+1}}{dz^{n+1}}\ln\Gamma(z) = \frac{d^{n}}{dz^{n}}\psi(z) = \psi^{(n)}(z). \tag{10.44a}$$


1 Compare Section 5.2, Eq. (5.27). We add and subtract $\sum_{s=1}^{n} s^{-1}$.

2 $\gamma$ has been computed to 1271 places by D. E. Knuth, Math. Comput. 16, 275 (1962) and to 3566
decimal places by D. W. Sweeney, Math. Comput. 17, 170 (1963). It may be of interest that the
fraction 228/395 gives $\gamma$ accurate to six places.

3 See Chapter 5. For z ≠ 0 this series may be used to define a generalized zeta function.
Maclaurin Expansion, Computation

It is now possible to write a Maclaurin expansion for ln(z!):

$$\ln(z!) = \sum_{n=1}^{\infty}\frac{z^{n}}{n!}\,\psi^{(n-1)}(1) = -\gamma z + \sum_{n=2}^{\infty}(-1)^{n}\,\frac{z^{n}}{n}\,\zeta(n), \tag{10.44b}$$

convergent for |z| < 1; for z = x, the range is -1 < x ≤ 1. Equation (10.44b)
is a possible means of computing z! for real or complex z, but Stirling’s series
(Section 10.3) is usually better. In addition, an excellent table of values of the
gamma function for complex arguments based on the use of Stirling’s series
and the recurrence relation [Eq. (10.29)] is available⁴ and can be accessed by
symbolic software, such as Mathematica, Maple, Mathcad, and Reduce.
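As an illustration of Eq. (10.44b) as a computational tool, the sketch below sums the Maclaurin series for real |z| < 1 and compares it with `math.lgamma(z + 1)`. The zeta values are obtained by direct summation with an Euler-Maclaurin tail correction; that correction, the truncation at 80 terms, and the function names are assumptions of this sketch, not part of the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant to double precision

def zeta(s, n_direct=100):
    """Riemann zeta for real s > 1: direct sum plus Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, n_direct))
    n = float(n_direct)
    tail = n ** (1 - s) / (s - 1) + 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail

def log_factorial_series(z, n_terms=80):
    """ln(z!) from the Maclaurin series of Eq. (10.44b), for real |z| < 1."""
    total = -EULER_GAMMA * z
    for n in range(2, n_terms + 1):
        total += (-1) ** n * z ** n * zeta(n) / n
    return total

if __name__ == "__main__":
    for z in (-0.5, 0.25, 0.5):
        print(z, log_factorial_series(z), math.lgamma(z + 1))
```

The terms decrease like |z|ⁿ/n, so the series is practical only when |z| is well inside the unit circle; the recurrence relation can be used to bring other arguments into that range.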



Series Summation 

The digamma and polygamma functions may also be used in summing series. 
If the general term of the series has the form of a rational fraction (with the 
highest power of the index in the numerator at least two less than the highest 
power of the index in the denominator), it may be transformed by the method 
of partial fractions. The infinite series may then be expressed as a finite sum of 
digamma and polygamma functions. The usefulness of this method depends 
on the availability of tables of digamma and polygamma functions. Such tables 
and examples of series summation are given in AMS-55, Chapter 6. 


EXAMPLE 10.2.1 


Catalan’s Constant Catalan’s constant, Exercise 5.2.13, or β(2), is given by

$$K = \beta(2) = \sum_{k=0}^{\infty}\frac{(-1)^{k}}{(2k+1)^2}. \tag{10.44c}$$

Grouping the positive and negative terms separately and starting with unit
index [to match the form of $\psi^{(1)}$; Eq. (10.41)], we obtain

$$K = 1 + \sum_{n=1}^{\infty}\frac{1}{(4n+1)^2} - \frac{1}{9} - \sum_{n=1}^{\infty}\frac{1}{(4n+3)^2}.$$

Now, quoting Eq. (10.41), we get

$$K = \frac{8}{9} + \frac{1}{16}\,\psi^{(1)}\left(1 + \tfrac{1}{4}\right) - \frac{1}{16}\,\psi^{(1)}\left(1 + \tfrac{3}{4}\right). \tag{10.44d}$$

Using the values of $\psi^{(1)}$ from Table 6.1 of AMS-55, we obtain

$$K = 0.91596559\ldots.$$

Compare this calculation of Catalan’s constant with the calculations of
Chapter 5, using either direct summation by computer or a modification using
Riemann zeta functions and then a (shorter) computer code. ■
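In place of the AMS-55 table, the two trigamma values in Eq. (10.44d) can be computed from the series of Eq. (10.41) with m = 1. The sketch below does this with an Euler-Maclaurin tail correction (an assumption of this sketch, added to speed up convergence) and compares the result with a brute-force sum of the alternating series (10.44c). The function names are hypothetical.

```python
import math

def trigamma_shifted(z, n_direct=100):
    """psi^(1)(1 + z) = sum_{n>=1} 1/(z + n)^2, direct sum plus Euler-Maclaurin tail."""
    head = sum(1.0 / (z + n) ** 2 for n in range(1, n_direct))
    x = z + n_direct
    # Tail for n >= n_direct: integral + half the first term + derivative correction.
    tail = 1.0 / x + 0.5 / x ** 2 + 1.0 / (6.0 * x ** 3)
    return head + tail

def catalan_from_trigamma():
    """Catalan's constant from Eq. (10.44d)."""
    return 8.0 / 9.0 + (trigamma_shifted(0.25) - trigamma_shifted(0.75)) / 16.0

def catalan_direct(n_terms=200000):
    """Catalan's constant by direct summation of Eq. (10.44c)."""
    return sum((-1) ** k / (2 * k + 1) ** 2 for k in range(n_terms))

if __name__ == "__main__":
    print(catalan_from_trigamma())
    print(catalan_direct())
```

The trigamma route needs about a hundred terms per function value, whereas the direct alternating sum needs hundreds of thousands of terms for comparable accuracy.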


4 Table of the Gamma Function for Complex Arguments, Applied Mathematics Series No. 34. 
National Bureau of Standards, Washington, DC (1954). 
EXERCISES


10.2.1 Verify that the following two forms of the digamma function,

$$\psi(x+1) = \sum_{r=1}^{x}\frac{1}{r} - \gamma$$

and

$$\psi(x+1) = \sum_{r=1}^{\infty}\frac{x}{r(r+x)} - \gamma,$$

are equal to each other (for x a positive integer).

10.2.2 Show that $\psi(z+1)$ has the series expansion

$$\psi(z+1) = -\gamma + \sum_{n=2}^{\infty}(-1)^{n}\,\zeta(n)\,z^{n-1}.$$

10.2.3 For a power series expansion of ln(z!), AMS-55 lists

$$\ln(z!) = -\ln(1+z) + z(1-\gamma) + \sum_{n=2}^{\infty}(-1)^{n}\,[\zeta(n) - 1]\,\frac{z^{n}}{n}.$$

(a) Show that this agrees with Eq. (10.44b) for |z| < 1.

(b) What is the range of convergence of this new expression?


10.2.4 Show that

$$\frac{1}{2}\ln\left(\frac{\pi z}{\sin\pi z}\right) = \sum_{n=1}^{\infty}\frac{\zeta(2n)}{2n}\,z^{2n}, \qquad |z| < 1.$$

Hint. Try Eq. (10.32).


10.2.5 Write out a Weierstrass infinite product definition of z!. Without
differentiating, show that this leads directly to the Maclaurin expansion
of ln(z!) [Eq. (10.44b)].

10.2.6 Derive the difference relation for the polygamma function

$$\psi^{(m)}(z+2) = \psi^{(m)}(z+1) + (-1)^m \frac{m!}{(z+1)^{m+1}}, \qquad m = 0, 1, 2, \ldots.$$

10.2.7 Show that if

$$\Gamma(x + iy) = u + iv$$

then

$$\Gamma(x - iy) = u - iv.$$

This is a special case of the Schwarz reflection principle (Section 6.5).

10.2.8 The Pochhammer symbol $(a)_n$ is defined as

$$(a)_n = a(a+1) \cdots (a+n-1), \qquad (a)_0 = 1$$

(for integral n).

(a) Express $(a)_n$ in terms of factorials.

(b) Find $(d/da)(a)_n$ in terms of $(a)_n$ and digamma functions.

ANS. $\dfrac{d}{da}(a)_n = (a)_n[\psi(a+n) - \psi(a)]$.

(c) Show that

$$(a)_{n+k} = (a+n)_k \cdot (a)_n.$$


10.2.9 Verify the following special values of the $\psi$ form of the di- and
polygamma functions:

$$\psi(1) = -\gamma, \qquad \psi^{(1)}(1) = \zeta(2), \qquad \psi^{(2)}(1) = -2\zeta(3).$$

10.2.10 Derive the polygamma function recurrence relation

$$\psi^{(m)}(1+z) = \psi^{(m)}(z) + (-1)^m \frac{m!}{z^{m+1}}, \qquad m = 0, 1, 2, \ldots.$$

10.2.11 Verify

(a) $\displaystyle\int_0^{\infty} e^{-r} \ln r \, dr = -\gamma.$

(b) $\displaystyle\int_0^{\infty} r e^{-r} \ln r \, dr = 1 - \gamma.$

(c) $\displaystyle\int_0^{\infty} r^n e^{-r} \ln r \, dr = (n-1)! + n \int_0^{\infty} r^{n-1} e^{-r} \ln r \, dr, \qquad n = 1, 2, 3, \ldots.$

Hint. These may be verified by integration by parts, three parts, or
differentiating the integral form of $n!$ with respect to n.

10.2.12 Dirac relativistic wave functions for hydrogen involve factors such as
$[2(1 - \alpha^2 Z^2)^{1/2}]!$, where $\alpha$, the fine structure constant, is $\tfrac{1}{137}$ and Z is
the atomic number. Expand $[2(1 - \alpha^2 Z^2)^{1/2}]!$ in a series of powers of
$\alpha^2 Z^2$.

10.2.13 The quantum mechanical description of a particle in a Coulomb field
requires a knowledge of the phase of the complex factorial function.
Determine the phase of $(1 + ib)!$ for small b.

10.2.14 The total energy radiated by a black body is given by

$$u = \frac{8\pi k^4 T^4}{c^3 h^3} \int_0^{\infty} \frac{x^3}{e^x - 1}\, dx.$$

Show that the integral in this expression is equal to $3!\,\zeta(4)$ [$\zeta(4) =
\pi^4/90 = 1.0823\ldots$]. The final result is the Stefan-Boltzmann law.

10.2.15 As a generalization of the result in Exercise 10.2.14, show that

$$\int_0^{\infty} \frac{x^s\, dx}{e^x - 1} = s!\,\zeta(s+1), \qquad \Re(s) > 0.$$

10.2.16 The neutrino energy density (Fermi distribution) in the early history
of the universe is given by

$$\rho_\nu = \frac{4\pi}{h^3} \int_0^{\infty} \frac{x^3}{\exp(x/kT) + 1}\, dx.$$

Show that

$$\rho_\nu = \frac{7\pi^5}{30 h^3} (kT)^4.$$

10.2.17 Prove that

$$\int_0^{\infty} \frac{x^s\, dx}{e^x + 1} = s!\,(1 - 2^{-s})\,\zeta(s+1), \qquad \Re(s) > 0.$$

Exercises 10.2.15 and 10.2.17 actually constitute Mellin integral transforms.


10.2.18 Prove that

$$\psi^{(m)}(z) = (-1)^{m+1} \int_0^{\infty} \frac{t^m e^{-zt}}{1 - e^{-t}}\, dt, \qquad \Re(z) > 0.$$

10.2.19 Using di- and polygamma functions sum the series

(a) $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$, \qquad (b) $\displaystyle\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$.

Note. You can use Exercise 10.2.6 to calculate the needed digamma
functions.


10.2.20 Show that

$$\sum_{n=1}^{\infty} \frac{1}{(n+a)(n+b)} = \frac{1}{b-a} \{\psi(1+b) - \psi(1+a)\},$$

$a \neq b$, and neither a nor b is a negative integer. It is of interest to
compare this summation with the corresponding integral

$$\int_1^{\infty} \frac{dx}{(x+a)(x+b)} = \frac{1}{b-a} \{\ln(1+b) - \ln(1+a)\}.$$


10.3 Stirling’s Series 


For computation of $\ln(z!)$ for very large z (statistical mechanics) and for
numerical computations at nonintegral values of z, a series expansion of $\ln(z!)$
in negative powers of z is desirable. Perhaps the most elegant way of deriving
such an expansion is by the method of steepest descents (Section 7.3).
The following method, starting with a numerical integration formula, does not
require knowledge of contour integration and is particularly direct.


Derivation from Euler-Maclaurin Integration Formula


The Euler-Maclaurin formula (Section 5.9) for evaluating a definite integral$^5$
is

$$\int_0^n f(x)\, dx = \tfrac{1}{2} f(0) + f(1) + f(2) + \cdots + \tfrac{1}{2} f(n) - b_2[f'(n) - f'(0)] - b_4[f'''(n) - f'''(0)] - \cdots, \tag{10.45}$$

in which the $b_{2n}$ are related to the Bernoulli numbers $B_{2n}$ by

$$(2n)!\, b_{2n} = B_{2n}, \tag{10.46}$$

$$B_0 = 1, \quad B_2 = \tfrac{1}{6}, \quad B_4 = -\tfrac{1}{30}, \quad B_6 = \tfrac{1}{42}, \quad B_8 = -\tfrac{1}{30}, \quad B_{10} = \tfrac{5}{66}, \quad \text{and so on.} \tag{10.47}$$

By applying Eq. (10.45) to the elementary definite integral

$$\int_0^{\infty} \frac{dx}{(z+x)^2} = \frac{1}{z}, \qquad f(x) = \frac{1}{(z+x)^2}, \tag{10.48}$$

(for z not on the negative real axis), we obtain for $n \to \infty$,

$$\frac{1}{z} = \frac{1}{2z^2} + \psi^{(1)}(z+1) - \frac{2!\, b_2}{z^3} - \frac{4!\, b_4}{z^5} - \cdots. \tag{10.49}$$

This is the reason for using Eq. (10.48). The Euler-Maclaurin evaluation yields
$\psi^{(1)}(z+1)$, which is $d^2 \ln(z!)/dz^2 = \sum_{n=1}^{\infty} (z+n)^{-2}$ from Eq. (10.41).


5 This is obtained by repeated integration by parts.

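Using Eq. (10.46) and solving for $\psi^{(1)}(z+1)$, we have

$$\psi^{(1)}(z+1) = \frac{d}{dz}\psi(z+1) = \frac{1}{z} - \frac{1}{2z^2} + \frac{B_2}{z^3} + \frac{B_4}{z^5} + \cdots = \frac{1}{z} - \frac{1}{2z^2} + \sum_{n=1}^{\infty} \frac{B_{2n}}{z^{2n+1}}. \tag{10.50}$$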


Since the Bernoulli numbers diverge strongly, this series does not converge. 
It is a semiconvergent or asymptotic series (Section 5.10), in which the sum 
always has a finite number of terms (compare Section 5.10). 

Integrating once, we get the digamma function

$$\psi(z+1) = C_1 + \ln z + \frac{1}{2z} - \frac{B_2}{2z^2} - \frac{B_4}{4z^4} - \cdots = C_1 + \ln z + \frac{1}{2z} - \sum_{n=1}^{\infty} \frac{B_{2n}}{2n z^{2n}}. \tag{10.51}$$

Integrating Eq. (10.51) with respect to z from $z - 1$ to z and then letting z
approach infinity, $C_1$, the constant of integration, may be shown to vanish. This
gives us a second expression for the digamma function, often more useful than
Eq. (10.39).
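To see how quickly Eq. (10.51) with $C_1 = 0$ settles down, the short Python sketch below compares a truncated asymptotic series with the exact value $\psi(n+1) = -\gamma + \sum_{r=1}^{n} 1/r$ at integer argument. The function names and the hard-coded constants are choices made for this illustration only.

```python
import math

EULER_GAMMA = 0.5772156649015329            # Euler-Mascheroni constant
BERNOULLI_EVEN = [1/6, -1/30, 1/42, -1/30, 5/66]   # B_2, B_4, ..., B_10

def digamma_asymptotic(z: float, terms: int = 3) -> float:
    """psi(z + 1) from Eq. (10.51) with C_1 = 0, truncated after `terms` Bernoulli terms."""
    value = math.log(z) + 1.0 / (2.0 * z)
    for n in range(1, terms + 1):
        value -= BERNOULLI_EVEN[n - 1] / (2 * n * z ** (2 * n))
    return value

def digamma_integer(n: int) -> float:
    """Exact psi(n + 1) = H_n - gamma for a positive integer n."""
    return sum(1.0 / r for r in range(1, n + 1)) - EULER_GAMMA

if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, digamma_asymptotic(float(n)), digamma_integer(n))
```

Because the series is asymptotic, adding more terms at fixed small z eventually makes the result worse; the agreement improves rapidly as z grows.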


The indefinite integral of the digamma function [Eq. (10.51)] is

$$\ln(z!) = C_2 + \left(z + \tfrac{1}{2}\right) \ln z - z + \frac{B_2}{2z} + \cdots + \frac{B_{2n}}{2n(2n-1) z^{2n-1}} + \cdots, \tag{10.52}$$

in which $C_2$ is another constant of integration. To fix $C_2$, we start from the
asymptotic formula [Eq. (7.89) from Example 7.3.2]

$$z! \sim \sqrt{2\pi}\, z^{z+1/2} e^{-z}.$$

This gives for large enough $|z|$

$$\ln(z!) \sim \tfrac{1}{2} \ln 2\pi + (z + 1/2) \ln z - z, \tag{10.53}$$

and comparing with Eq. (10.52), we find that $C_2$ is

$$C_2 = \tfrac{1}{2} \ln 2\pi, \tag{10.54}$$

giving for $|z| \to \infty$

$$\ln(z!) = \frac{1}{2} \ln 2\pi + \left(z + \frac{1}{2}\right) \ln z - z + \frac{1}{12z} - \frac{1}{360 z^3} + \frac{1}{1260 z^5} - \cdots. \tag{10.55}$$

Figure 10.5 Accuracy of Stirling’s Formula

This is Stirling’s series, an asymptotic expansion. The absolute value of the
error is less than the absolute value of the first term neglected. For large
enough $|z|$, the simplest approximation $\ln(z!) \sim z \ln z - z$ may be sufficient.

To help convey a sense of the remarkable precision of Stirling’s series for $s!$,
the ratio of the first term of Stirling’s approximation to $s!$ is plotted in Fig. 10.5.
A tabulation gives the ratio of the first term in the expansion to $s!$ and the ratio
of the first two terms in the expansion to $s!$ (Table 10.1). The derivation of
these forms is Exercise 10.3.1.
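The ratios just described are easy to recompute. The following Python sketch is an illustration under stated assumptions: it uses the standard-library `math.gamma` and `math.lgamma` as the reference values for $s!$ and $\ln(s!)$, and the function names are hypothetical.

```python
import math

def stirling_ratio(s: float, with_correction: bool = False) -> float:
    """Ratio of sqrt(2 pi) s^(s+1/2) e^(-s) [optionally times 1 + 1/(12 s)] to s!."""
    approx = math.sqrt(2.0 * math.pi) * s ** (s + 0.5) * math.exp(-s)
    if with_correction:
        approx *= 1.0 + 1.0 / (12.0 * s)
    return approx / math.gamma(s + 1.0)

def ln_factorial_stirling(z: float) -> float:
    """ln(z!) from Eq. (10.55), truncated after the 1/(1260 z^5) term."""
    return (0.5 * math.log(2.0 * math.pi) + (z + 0.5) * math.log(z) - z
            + 1.0 / (12.0 * z) - 1.0 / (360.0 * z ** 3) + 1.0 / (1260.0 * z ** 5))

if __name__ == "__main__":
    for s in range(1, 11):
        print(s, round(stirling_ratio(s), 5), round(stirling_ratio(s, True), 5))
    # Compare the truncated series with the library value of ln(10!).
    print(ln_factorial_stirling(10.0), math.lgamma(11.0))
```

Even at $s = 1$ the leading term is within about 8% of $s!$, and the single correction factor $1 + 1/(12s)$ brings the ratio to within roughly 0.1%.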


Numerical Computation 


Table 10.1 Stirling’s Formula Compared with Stirling’s Series for n = 2

 s     \sqrt{2\pi} s^{s+1/2} e^{-s} / s!     \sqrt{2\pi} s^{s+1/2} e^{-s} [1 + 1/(12s)] / s!
 1     0.92213                               0.99898
 2     0.95950                               0.99949
 3     0.97270                               0.99972
 4     0.97942                               0.99983
 5     0.98349                               0.99988
 6     0.98621                               0.99992
 7     0.98817                               0.99994
 8     0.98964                               0.99995
 9     0.99078                               0.99996
10     0.99170                               0.99998


The possibility of using the Maclaurin expansion [Eq. (10.44b)] for the numerical
evaluation of the factorial function is mentioned in Section 10.2. However,
for large x, Stirling’s series [Eq. (10.55)] gives much more accuracy. The
Table of the Gamma Function for Complex Arguments, Applied Mathematics
Series No. 34, National Bureau of Standards, is based on the use of Stirling’s
series for $z = x + iy$, $9 \le x \le 10$. Lower values of x are reached with the
recurrence relation [Eq. (10.29)]. Now suppose the numerical value of $x!$ is needed
for some particular value of x in a computer code. How shall we instruct the
computer to do $x!$? Stirling’s series followed by the recurrence relation is a
good possibility. An even better possibility is to fit $x!$, $0 \le x \le 1$, by a short
power series (polynomial) and then calculate $x!$ directly from this empirical
fit. Presumably, the computer has been told the values of the coefficients of
the polynomial. Such polynomial fits have been made by Hastings$^6$ for various
accuracy requirements. For example,


$$x! = 1 + \sum_{n=1}^{8} b_n x^n + \varepsilon(x), \tag{10.56a}$$

with

$$\begin{aligned}
b_1 &= -0.57719\,1652 & b_5 &= -0.75670\,4078 \\
b_2 &= \phantom{-}0.98820\,5891 & b_6 &= \phantom{-}0.48219\,9394 \\
b_3 &= -0.89705\,6937 & b_7 &= -0.19352\,7818 \\
b_4 &= \phantom{-}0.91820\,6857 & b_8 &= \phantom{-}0.03586\,8343
\end{aligned} \tag{10.56b}$$

with the magnitude of the error $|\varepsilon(x)| < 3 \times 10^{-7}$, $0 \le x \le 1$.

This is not a least-squares fit. Hastings employed a Chebyshev polynomial
technique to minimize the maximum value of $|\varepsilon(x)|$ in Eq. (10.56a).
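The polynomial fit of Eq. (10.56a) combined with the recurrence relation can be sketched in a few lines of Python. This is a minimal illustration, not Hastings’ own code: the function names are hypothetical, the coefficients are those of Eq. (10.56b), and `math.gamma` serves only as an independent check.

```python
import math

# Coefficients b_1 ... b_8 of Eq. (10.56b).
HASTINGS_B = [-0.577191652, 0.988205891, -0.897056937, 0.918206857,
              -0.756704078, 0.482199394, -0.193527818, 0.035868343]

def factorial_unit_interval(x: float) -> float:
    """x! for 0 <= x <= 1 from the polynomial 1 + sum b_n x^n (Horner form)."""
    acc = 0.0
    for b in reversed(HASTINGS_B):
        acc = (acc + b) * x
    return 1.0 + acc

def factorial_real(x: float) -> float:
    """x! for real x >= 0: reduce to [0, 1] with x! = x (x - 1)!, then use the fit."""
    if x < 0.0:
        raise ValueError("this sketch handles x >= 0 only")
    product = 1.0
    while x > 1.0:
        product *= x
        x -= 1.0
    return product * factorial_unit_interval(x)

if __name__ == "__main__":
    for x in (0.25, 0.5, 1.0, 4.5):
        print(x, factorial_real(x), math.gamma(x + 1.0))
```

Reducing the argument first keeps the polynomial inside the interval on which it was fitted; evaluating it outside $0 \le x \le 1$ would not carry the stated error bound.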


SUMMARY


The Euler integral

$$n! = \int_0^{\infty} e^{-t} t^n\, dt = \Gamma(n+1), \qquad n = 0, 1, \ldots$$

gives the most direct entry to the gamma function. The functional equation
$\Gamma(z+1) = z\Gamma(z)$ is its characteristic property that leads to the infinite product
representation of its inverse, an entire function, and to the pole expansion
of its logarithmic derivative, a meromorphic function. The asymptotic expansion,
known as Stirling’s series, is widely applied in statistical mechanics and
leads to similar asymptotic expansions for the error integral and other related
functions.


6 Hastings, C., Jr. (1955). Approximations for Digital Computers. Princeton Univ. Press, Princeton, NJ.


EXERCISES 


10.3.1 Rewrite Stirling’s series to give $z!$ instead of $\ln(z!)$.

ANS. $z! = \sqrt{2\pi}\, z^{z+1/2} e^{-z} \left(1 + \dfrac{1}{12z} + \dfrac{1}{288 z^2} - \dfrac{139}{51840 z^3} + \cdots\right)$.


10.3.2 Use Stirling’s formula to estimate 52!, the number of possible
rearrangements of cards in a standard deck of playing cards.


10.3.3 By integrating Eq. (10.51) from $z - 1$ to z and then letting $z \to \infty$,
evaluate the constant $C_1$ in the asymptotic series for the digamma
function $\psi(z+1)$.

10.3.4 Show that the constant $C_2$ in Stirling’s formula Eq. (10.52) equals
$\tfrac{1}{2} \ln 2\pi$ by using the logarithm of the doubling formula.

10.3.5 By direct expansion verify the doubling formula for $z = n + \tfrac{1}{2}$; n is an
integer.


10.3.6 Without using Stirling’s series show that

(a) $\ln(n!) < \displaystyle\int_1^{n+1} \ln x\, dx$, \qquad (b) $\ln(n!) > \displaystyle\int_1^{n} \ln x\, dx$; n is an integer $\ge 2$.

Notice that the arithmetic mean of these two integrals gives a good
approximation for Stirling’s series.


10.3.7 Test for convergence

$$\sum_{p=0}^{\infty} \left[\frac{(p - \tfrac{1}{2})!}{p!}\right]^2 \frac{2p+1}{2p+2} = \pi \sum_{p=0}^{\infty} \frac{(2p-1)!!\,(2p+1)!!}{(2p)!!\,(2p+2)!!}.$$

This series arises in an attempt to describe the magnetic field created
by and enclosed by a current loop.


10.3.8 Show that

$$\lim_{x \to \infty} x^{b-a}\, \frac{(x+a)!}{(x+b)!} = 1.$$

10.3.9 Show that

$$\lim_{n \to \infty} \frac{(2n-1)!!}{(2n)!!}\, n^{1/2} = \pi^{-1/2}.$$

10.3.10 Calculate the binomial coefficient $\binom{2n}{n}$ (see Chapter 5) to six
significant figures for n = 10, 20, and 30. Check your values by

(a) a Stirling series approximation through terms in $n^{-1}$; and

(b) a double precision calculation.

ANS. $\binom{20}{10} = 1.84756 \times 10^{5}$, $\binom{40}{20} = 1.37846 \times 10^{11}$, $\binom{60}{30} = 1.18264 \times 10^{17}$.

10.3.11 Truncate the Stirling formula for $\ln n!$ so that the error is less than 10%
for n > 1, <1% for n > 10, and <0.1% for n > 100.

10.3.12 Derive $\dfrac{d}{dz} \ln \Gamma(z) \sim \ln z - \dfrac{1}{2z}$ from Stirling’s formula.



Generalizing the Euler definition of the gamma function [Eq. (10.5)], we define
the incomplete gamma functions by the variable limit integrals

$$\gamma(a, x) = \int_0^{x} e^{-t} t^{a-1}\, dt, \qquad \Re(a) > 0 \tag{10.57}$$

and

$$\Gamma(a, x) = \int_x^{\infty} e^{-t} t^{a-1}\, dt. \tag{10.58}$$

Clearly, the two functions are related because

$$\gamma(a, x) + \Gamma(a, x) = \Gamma(a). \tag{10.59}$$


These functions are useful for the error integrals discussed later. The choice of
employing $\gamma(a, x)$ or $\Gamma(a, x)$ is purely a matter of convenience. If the parameter
a is a positive integer, Eq. (10.58) may be integrated completely to yield

$$\begin{aligned}
\gamma(n, x) &= (n-1)! \left(1 - e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!}\right), \\
\Gamma(n, x) &= (n-1)!\, e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!}, \qquad n = 1, 2, \ldots.
\end{aligned} \tag{10.60}$$

For nonintegral a, a power series expansion of $\gamma(a, x)$ for small x and an
asymptotic expansion of $\Gamma(a, x)$ are developed in terms of the error function
in Section 5.10 [see also Eq. (10.70b)]:

$$\begin{aligned}
\gamma(a, x) &= x^a \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!\,(a+n)}, \\
\Gamma(a, x) &\sim x^{a-1} e^{-x} \sum_{n=0}^{\infty} \frac{(a-1)!}{(a-1-n)!}\, \frac{1}{x^n}
 = x^{a-1} e^{-x} \sum_{n=0}^{\infty} (-1)^n \frac{(n-a)!}{(-a)!}\, \frac{1}{x^n}.
\end{aligned} \tag{10.61}$$
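A quick consistency check of Eqs. (10.59) to (10.61) can be made numerically. In the Python sketch below (function names are hypothetical and chosen for this illustration), the closed forms of Eq. (10.60) for integer a are compared with the small-x power series of Eq. (10.61).

```python
import math

def lower_gamma_series(a: float, x: float, terms: int = 80) -> float:
    """gamma(a, x) from the power series x^a * sum (-1)^n x^n / (n! (a + n))."""
    total = 0.0
    power = 1.0                      # holds (-x)^n / n!
    for n in range(terms):
        total += power / (a + n)
        power *= -x / (n + 1)
    return x ** a * total

def lower_gamma_integer(n: int, x: float) -> float:
    """gamma(n, x) for positive integer n, first line of Eq. (10.60)."""
    partial = sum(x ** s / math.factorial(s) for s in range(n))
    return math.factorial(n - 1) * (1.0 - math.exp(-x) * partial)

def upper_gamma_integer(n: int, x: float) -> float:
    """Gamma(n, x) for positive integer n, second line of Eq. (10.60)."""
    partial = sum(x ** s / math.factorial(s) for s in range(n))
    return math.factorial(n - 1) * math.exp(-x) * partial

if __name__ == "__main__":
    n, x = 3, 2.0
    print(lower_gamma_series(n, x), lower_gamma_integer(n, x))
    # Eq. (10.59): the two pieces add up to Gamma(n) = (n - 1)!.
    print(lower_gamma_integer(n, x) + upper_gamma_integer(n, x))
```

The alternating power series is convenient for moderate x; for large x its terms grow before they decay, and the asymptotic form for $\Gamma(a, x)$ is the more practical choice.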


These incomplete gamma functions may also be expressed elegantly in terms
of confluent hypergeometric functions.

Figure 10.6 The Exponential Integral, $E_1(x) = -\mathrm{Ei}(-x)$
Exponential Integral 


Although the incomplete gamma function $\Gamma(a, x)$ in its general form [Eq.
(10.61)] is only infrequently encountered in physical problems, a special case
is very useful. We define the exponential integral by$^7$

$$-\mathrm{Ei}(-x) \equiv \int_x^{\infty} \frac{e^{-t}}{t}\, dt \equiv E_1(x) \tag{10.62}$$

(Fig. 10.6). To obtain a series expansion for small x, we start from

$$E_1(x) = \Gamma(0, x) = \lim_{a \to 0} [\Gamma(a) - \gamma(a, x)]. \tag{10.63}$$

Caution is needed here because the integral in Eq. (10.62) diverges
logarithmically as $x \to 0$. We may split the divergent term in the series expansion for
$\gamma(a, x)$,

$$E_1(x) = \lim_{a \to 0} \left[\frac{a\Gamma(a) - x^a}{a}\right] - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!}. \tag{10.64a}$$

Using l’Hôpital’s rule (Exercise 5.6.7) and

$$\frac{d}{da}\{a\Gamma(a)\} = \frac{d}{da}\, a! = \frac{d}{da}\, e^{\ln(a!)} = a!\, \psi(a+1), \tag{10.64b}$$

and then Eq. (10.39),$^8$ we obtain the rapidly converging series

$$E_1(x) = -\gamma - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!} \tag{10.65}$$

(Fig. 10.6). An asymptotic expansion is given in Section 5.10.
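The series of Eq. (10.65) is simple to evaluate directly. The Python sketch below is a minimal illustration; the function name and the hard-coded value of $\gamma$ are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329     # Euler-Mascheroni constant

def e1_series(x: float, terms: int = 60) -> float:
    """E_1(x) = -gamma - ln x - sum_{n>=1} (-1)^n x^n / (n * n!), for x > 0."""
    total = 0.0
    power = 1.0                      # holds (-x)^n / n!
    for n in range(1, terms + 1):
        power *= -x / n
        total += power / n
    return -EULER_GAMMA - math.log(x) - total

if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0):
        print(x, e1_series(x))
```

For x of order unity a dozen terms already give many digits; for large x the alternating terms become large before they decay, so the asymptotic expansion is preferable there.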


7 The appearance of the two minus signs in $-\mathrm{Ei}(-x)$ is an historical monstrosity. This integral is
generally referred to as $E_1(x)$.

8 $dx^a/da = x^a \ln x$.

Figure 10.7 Sine and Cosine Integrals

Further special forms related to the exponential integral are the sine integral,
cosine integral (Fig. 10.7), and logarithmic integral defined by$^9$

$$\begin{aligned}
\mathrm{si}(x) &= -\int_x^{\infty} \frac{\sin t}{t}\, dt, \\
\mathrm{Ci}(x) &= -\int_x^{\infty} \frac{\cos t}{t}\, dt, \\
\mathrm{li}(x) &= \int_0^{x} \frac{du}{\ln u} = \mathrm{Ei}(\ln x).
\end{aligned} \tag{10.66}$$

By transforming from real to imaginary argument, we can show that

$$\mathrm{si}(x) = \frac{1}{2i}[\mathrm{Ei}(ix) - \mathrm{Ei}(-ix)] = \frac{1}{2i}[E_1(ix) - E_1(-ix)], \tag{10.67}$$

whereas

$$\mathrm{Ci}(x) = \frac{1}{2}[\mathrm{Ei}(ix) + \mathrm{Ei}(-ix)] = -\frac{1}{2}[E_1(ix) + E_1(-ix)], \qquad |\arg x| < \frac{\pi}{2}. \tag{10.68}$$

Adding these two relations, we obtain

$$\mathrm{Ei}(ix) = \mathrm{Ci}(x) + i\, \mathrm{si}(x) \tag{10.69}$$

to show that the relation among these integrals is exactly analogous to that
among $e^{ix}$, $\cos x$, and $\sin x$. In terms of $E_1$,

$$E_1(ix) = -\mathrm{Ci}(x) + i\, \mathrm{si}(x).$$

Asymptotic expansions of $\mathrm{Ci}(x)$ and $\mathrm{si}(x)$ may be developed similar to those
for the error functions in Section 5.10. Power series expansions about the origin
for $\mathrm{Ci}(x)$, $\mathrm{si}(x)$, and $\mathrm{li}(x)$ may be obtained from those for the exponential
integral, $E_1(x)$, or by direct integration. The exponential, sine, and cosine integrals
are tabulated in AMS-55, Chapter 5, and can also be accessed by symbolic
packages such as Mathematica, Maple, Mathcad, and Reduce.


9 Another sine integral is given by $\mathrm{Si}(x) = \mathrm{si}(x) + \pi/2$.

Figure 10.8 Error Function erf x


Error Integrals 

The error integrals

$$\operatorname{erf} z = \frac{2}{\sqrt{\pi}} \int_0^{z} e^{-t^2}\, dt, \qquad \operatorname{erfc} z = 1 - \operatorname{erf} z = \frac{2}{\sqrt{\pi}} \int_z^{\infty} e^{-t^2}\, dt \tag{10.70a}$$

(normalized so that $\operatorname{erf} \infty = 1$) are introduced in Section 5.10 (Fig. 10.8).
Asymptotic forms are developed there. From the general form of the integrands
and Eq. (10.6) we expect that erf z and erfc z may be written as incomplete
gamma functions with $a = \tfrac{1}{2}$. The relations are

$$\operatorname{erf} z = \pi^{-1/2}\, \gamma\!\left(\tfrac{1}{2}, z^2\right), \qquad \operatorname{erfc} z = \pi^{-1/2}\, \Gamma\!\left(\tfrac{1}{2}, z^2\right). \tag{10.70b}$$

The power series expansion of erf z follows directly from Eq. (10.61).
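The last remark can be made concrete with a short Python sketch: setting $a = \tfrac{1}{2}$ and $x = z^2$ in the power series of Eq. (10.61) and using Eq. (10.70b) gives a series for erf z. The function name is hypothetical, the sketch is restricted to real z, and the library routine `math.erf` is used only as an independent check.

```python
import math

def erf_series(z: float, terms: int = 60) -> float:
    """erf z = pi^(-1/2) gamma(1/2, z^2), with gamma from the series in Eq. (10.61)."""
    x = z * z
    total = 0.0
    power = 1.0                      # holds (-x)^n / n!
    for n in range(terms):
        total += power / (0.5 + n)
        power *= -x / (n + 1)
    # x^(1/2) = |z|; erf is odd, so restore the sign of z afterwards.
    return math.copysign(1.0, z) * math.sqrt(x) * total / math.sqrt(math.pi)

if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 1.5):
        print(z, erf_series(z), math.erf(z))
```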


EXERCISES 
10.4.1 Show that 


(a) by repeatedly integrating by parts; and 

(b) demonstrate this relation by transforming it into Eq. (10.61). 


10.4.2 Show that

(a) $\dfrac{d^m}{dx^m} [x^{-a} \gamma(a, x)] = (-1)^m x^{-a-m} \gamma(a+m, x)$,

(b) $\dfrac{d^m}{dx^m} [e^{x} \gamma(a, x)] = e^{x}\, \dfrac{\Gamma(a)}{\Gamma(a-m)}\, \gamma(a-m, x)$.

10.4.3 Show that $\gamma(a, x)$ and $\Gamma(a, x)$ satisfy the recurrence relations

(a) $\gamma(a+1, x) = a\gamma(a, x) - x^a e^{-x}$,

(b) $\Gamma(a+1, x) = a\Gamma(a, x) + x^a e^{-x}$.


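10.4.4 The potential produced by a 1s hydrogen electron (Exercise 10.4.11)
is given by

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \left\{\frac{1}{2r}\, \gamma(3, 2r) + \Gamma(2, 2r)\right\}.$$

(a) For $r \ll 1$, show that

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \left\{1 - \frac{2}{3} r^2 + \cdots\right\}.$$

(b) For $r \gg 1$, show that

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0} \cdot \frac{1}{r}.$$

Here, r is a pure number, the number of Bohr radii, $a_0$.

Note. For computation at intermediate values of r, Eqs. (10.60) are
convenient.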


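10.4.5 The potential of a 2p hydrogen electron is found to be

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{q}{24 a_0} \left\{\frac{1}{r}\, \gamma(5, r) + \Gamma(4, r)\right\}
 - \frac{1}{4\pi\varepsilon_0} \cdot \frac{q}{120 a_0} \left\{\frac{1}{r^3}\, \gamma(7, r) + r^2\, \Gamma(2, r)\right\} P_2(\cos\theta).$$

Here, r is expressed in units of $a_0$, the Bohr radius. $P_2(\cos\theta)$ is a
Legendre polynomial (Section 11.1).

(a) For $r \ll 1$, show that

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{q}{a_0} \left\{\frac{1}{4} - \frac{1}{120}\, r^2 P_2(\cos\theta)\right\}.$$

(b) For $r \gg 1$, show that

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{q}{a_0 r} \left\{1 - \frac{6}{r^2}\, P_2(\cos\theta)\right\}.$$

10.4.6 Prove the expansion

$$\int_x^{\infty} \frac{e^{-t}}{t}\, dt = -\gamma - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n \cdot n!}$$

for the exponential integral. Here, $\gamma$ is the Euler-Mascheroni constant.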
10.4.7 Show that $E_1(z)$ may be written as

$$E_1(z) = e^{-z} \int_0^{\infty} \frac{e^{-zt}}{1+t}\, dt.$$

Show also that we must impose the condition $|\arg z| \le \pi/2$.

10.4.8 Related to the exponential integral [Eq. (10.62)] is the function

$$E_n(x) = \int_1^{\infty} \frac{e^{-xt}}{t^n}\, dt.$$

Show that $E_n(x)$ satisfies the recurrence relation

$$E_{n+1}(x) = \frac{1}{n}\, e^{-x} - \frac{x}{n}\, E_n(x), \qquad n = 1, 2, 3, \ldots.$$

10.4.9 With $E_n(x)$ defined in Exercise 10.4.8, show that $E_n(0) = 1/(n-1)$,
$n > 1$.

10.4.10 Using the relation 

T(a) = y(a, x) + r(a, x), 

show that if y(a,x) satisfies the relations of Exercise 10.4.2, then 
T(a, x) must satisfy the same relations. 

10.4.11 Calculate the potential produced by a 1s hydrogen electron (Exercise
10.4.4) (Fig. 10.9). Tabulate $V(r)/(q/4\pi\varepsilon_0 a_0)$ for $0.0 \le x \le 4.0$, x in
steps of 0.1. Check your calculations for $r \ll 1$ and for $r \gg 1$ by
calculating the limiting forms given in Exercise 10.4.4.

Figure 10.9 Distributed Charge Potential Produced by a 1s Hydrogen Electron (Exercise 10.4.11)



Additional Reading 


Abramowitz, M., and Stegun, I. A. (Eds.) (1972). Handbook of Mathematical
Functions with Formulas, Graphs, and Mathematical Tables (AMS-55).
National Bureau of Standards, Washington, DC. Reprinted, Dover (1974).
Contains a wealth of information about gamma functions, incomplete
gamma functions, exponential integrals, error functions, and related functions
(Chapters 4-6).

Artin, E. (1964). The Gamma Function (M. Butler, Trans.). Holt, Rinehart &
Winston, New York. Demonstrates that if a function f(x) is smooth (log
convex) and equal to (n - 1)! when x = n = integer, it is the gamma
function.

Davis, H. T. (1933). Tables of the Higher Mathematical Functions. Principia,
Bloomington, IN. Volume 1 contains extensive information on the gamma
function and the polygamma functions.

Gradshteyn, I. S., and Ryzhik, I. M. (2000). Table of Integrals, Series, and
Products, 6th ed. Academic Press, New York.

Luke, Y. L. (1969). The Special Functions and Their Approximations, Vol. 1.
Academic Press, New York.

Luke, Y. L. (1975). Mathematical Functions and Their Approximations.
Academic Press, New York. This is an updated supplement to Handbook
of Mathematical Functions with Formulas, Graphs, and Mathematical
Tables (AMS-55). Chapter 1 deals with the gamma function. Chapter 4
treats the incomplete gamma function and a host of related functions.




Legendre Polynomials 
and Spherical 
Harmonics 


11.1 Introduction 


Legendre polynomials appear in many different mathematical and physical
situations:

• They originate as solutions of the Legendre ordinary differential equation 
(ODE), which we have already encountered in the separation of variables 
(Section 8.9) for Laplace’s equation, and similar ODEs in spherical polar 
coordinates. 

• They arise as a consequence of demanding a complete, orthogonal set 
of functions over the interval [—1, 1] (Gram-Schmidt orthogonalization; 
Section 9.3). 

• In quantum mechanics, they (really the spherical harmonics; Section 11.5) 
represent angular momentum eigenfunctions. They also appear naturally in 
problems with azimuthal symmetry, which is the case in the next point. 

• They are defined by a generating function: We introduce Legendre polynomials
here by way of the electrostatic potential of a point charge, which
acts as the generating function.


Physical Basis: Electrostatics 


Legendre polynomials appear in an expansion of the electrostatic potential in
inverse radial powers. Consider an electric charge q placed on the z-axis at
$z = a$. As shown in Fig. 11.1, the electrostatic potential of charge q is

$$\varphi = \frac{1}{4\pi\varepsilon_0} \cdot \frac{q}{r_1} \quad \text{(SI units)}. \tag{11.1}$$

We want to express the electrostatic potential in terms of the spherical polar
coordinates r and $\theta$ (the coordinate $\varphi$ is absent because of azimuthal symmetry,
that is, invariance under rotations about the z-axis). Using the law of cosines
in Fig. 11.1, we obtain

$$\varphi = \frac{q}{4\pi\varepsilon_0} (r^2 + a^2 - 2ar\cos\theta)^{-1/2}. \tag{11.2}$$

Figure 11.1 Electrostatic Potential. Charge q Displaced from Origin

Generating Function 


Consider the case of $r > a$. The radical in Eq. (11.2) may be expanded in a
binomial series (see Exercise 5.6.9) for $r^2 > |a^2 - 2ar\cos\theta|$ and then
rearranged in powers of $(a/r)$. This yields the coefficient 1 of $(a/r)^0 = 1$, $\cos\theta$
as coefficient of $a/r$, etc. The Legendre polynomial $P_n(\cos\theta)$ (Fig. 11.2) is
defined as the coefficient of $(a/r)^n$ so that

$$\varphi = \frac{q}{4\pi\varepsilon_0 r} \sum_{n=0}^{\infty} P_n(\cos\theta) \left(\frac{a}{r}\right)^n. \tag{11.3}$$

Dropping the factor $q/4\pi\varepsilon_0 r$ and using $x = \cos\theta$ and $t = a/r$, respectively, we
have

$$g(t, x) = (1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(x)\, t^n, \qquad |t| < 1, \tag{11.4}$$

defining $g(t, x)$ as the generating function for the $P_n(x)$. These polynomials
$P_n(x)$, shown in Table 11.1, are the same as those generated in Example 9.3.1 by
Gram-Schmidt orthogonalization of powers $x^n$ over the interval $-1 \le x \le 1$.
This is no coincidence because $\cos\theta$ varies between the limits $\pm 1$. In the next
section, it is shown that $|P_n(\cos\theta)| \le 1$, which means that the series expansion
[Eq. (11.4)] is convergent for $|t| < 1$.$^1$ Indeed, the series is convergent for $|t| = 1$
except for $x = \pm 1$, where $|P_n(\pm 1)| = 1$.


1 Note that the series in Eq. (11.3) is convergent for $r > a$ even though the binomial expansion
involved is valid only for $r > (a^2 + 2ar)^{1/2} \ge |a^2 - 2ar\cos\theta|^{1/2}$ so that $r^2 > a^2 + 2ar$, $(r - a)^2 > 2a^2$,
or $r > a(1 + \sqrt{2})$.

Figure 11.2 Legendre Polynomials $P_n(x)$


Table 11.1 Legendre Polynomials

$P_0(x) = 1$
$P_1(x) = x$
$P_2(x) = \tfrac{1}{2}(3x^2 - 1)$
$P_3(x) = \tfrac{1}{2}(5x^3 - 3x)$
$P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3)$
$P_5(x) = \tfrac{1}{8}(63x^5 - 70x^3 + 15x)$
$P_6(x) = \tfrac{1}{16}(231x^6 - 315x^4 + 105x^2 - 5)$
$P_7(x) = \tfrac{1}{16}(429x^7 - 693x^5 + 315x^3 - 35x)$
$P_8(x) = \tfrac{1}{128}(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$


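In physical applications, such as the Coulomb or gravitational potentials,
Eq. (11.4) often appears in the vector form

$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \left[r_1^2 + r_2^2 - 2\mathbf{r}_1 \cdot \mathbf{r}_2\right]^{-1/2} = \frac{1}{r_1} \left[1 + \left(\frac{r_2}{r_1}\right)^2 - 2\left(\frac{r_2}{r_1}\right) \cos\theta\right]^{-1/2}.$$

The last equality is obtained by factoring $r_1 = |\mathbf{r}_1|$ from the denominator, which
then, for $r_1 > r_2$, we can expand according to Eq. (11.4). In this way, we obtain

$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{r_>} \sum_{n=0}^{\infty} \left(\frac{r_<}{r_>}\right)^n P_n(\cos\theta), \tag{11.4a}$$

where

$$r_> = |\mathbf{r}_1|, \quad r_< = |\mathbf{r}_2| \qquad \text{for } |\mathbf{r}_1| > |\mathbf{r}_2|, \tag{11.4b}$$

and

$$r_> = |\mathbf{r}_2|, \quad r_< = |\mathbf{r}_1| \qquad \text{for } |\mathbf{r}_2| > |\mathbf{r}_1|. \tag{11.4c}$$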

EXAMPLE 11.1.1 


Special Values A simple and powerful application of the generating function
g is to use it for special values (e.g., $x = \pm 1$) where g can be evaluated explicitly.
If we set $x = 1$, that is, point $z = a$ on the positive z-axis, where the potential
has a simple form, Eq. (11.4) becomes

$$\frac{1}{(1 - 2t + t^2)^{1/2}} = \frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n, \tag{11.5}$$

using a binomial expansion or the geometric series (Example 5.1.2). However,
Eq. (11.4) for $x = 1$ defines

$$\frac{1}{(1 - 2t + t^2)^{1/2}} = \sum_{n=0}^{\infty} P_n(1)\, t^n.$$

Comparing the two series expansions (uniqueness of power series; Section
5.7), we have

$$P_n(1) = 1. \tag{11.6}$$

If we let $x = -1$ in Eq. (11.4), that is, point $z = -a$ on the negative z-axis in
Fig. 11.1, where the potential is simple, then we sum similarly

$$\frac{1}{(1 + 2t + t^2)^{1/2}} = \frac{1}{1 + t} = \sum_{n=0}^{\infty} (-t)^n \tag{11.7}$$

so that

$$P_n(-1) = (-1)^n. \tag{11.8}$$

These general results are more difficult to develop from other formulas for
Legendre polynomials.

If we take $x = 0$ in Eq. (11.4), using the binomial expansion

$$(1 + t^2)^{-1/2} = 1 - \frac{1}{2} t^2 + \frac{3}{8} t^4 + \cdots + (-1)^n \frac{1 \cdot 3 \cdots (2n-1)}{2^n n!}\, t^{2n} + \cdots, \tag{11.9}$$

we have$^2$

$$P_{2n}(0) = (-1)^n \frac{1 \cdot 3 \cdots (2n-1)}{2^n n!} = (-1)^n \frac{(2n-1)!!}{(2n)!!} = \frac{(-1)^n (2n)!}{2^{2n} (n!)^2}, \tag{11.10}$$

$$P_{2n+1}(0) = 0, \qquad n = 0, 1, 2, \ldots. \tag{11.11}$$

These results can also be verified by inspection of Table 11.1.


EXAMPLE 11.1.2 


Parity If we replace x by $-x$ and t by $-t$, the generating function is
unchanged. Hence,

$$g(t, x) = g(-t, -x) = [1 - 2(-t)(-x) + (-t)^2]^{-1/2} = \sum_{n=0}^{\infty} P_n(-x)(-t)^n = \sum_{n=0}^{\infty} P_n(x)\, t^n. \tag{11.12}$$

Comparing these two series, we have

$$P_n(-x) = (-1)^n P_n(x); \tag{11.13}$$

that is, the polynomial functions are odd or even (with respect to $x = 0$)
according to whether the index n is odd or even. This is the parity$^3$ or reflection
property that plays such an important role in quantum mechanics. Parity
is conserved when the Hamiltonian is invariant under the reflection of the
coordinates $\mathbf{r} \to -\mathbf{r}$. For central forces the index n is the orbital angular
momentum [and $n(n + 1)$ is the eigenvalue of $L^2$], thus linking parity and orbital
angular momentum. This parity property will be confirmed by the series
solution and for the special cases tabulated in Table 11.1. ■


Power Series 


Using the binomial theorem (Section 5.6) and Exercise 10.1.15, we expand the
generating function as

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty} \frac{(2n)!}{2^{2n} (n!)^2} (2xt - t^2)^n = 1 + \sum_{n=1}^{\infty} \frac{(2n-1)!!}{(2n)!!} (2xt - t^2)^n. \tag{11.14}$$

Before we expand $(2xt - t^2)^n$ further, let us inspect the lowest powers of t.


2 The double factorial notation is defined in Section 10.1:

$(2n)!! = 2 \cdot 4 \cdot 6 \cdots (2n)$, $\quad (2n-1)!! = 1 \cdot 3 \cdot 5 \cdots (2n-1)$.

3 In spherical polar coordinates the inversion of the point $(r, \theta, \varphi)$ through the origin is
accomplished by the transformation [$r \to r$, $\theta \to \pi - \theta$, and $\varphi \to \varphi \pm \pi$]. Then, $\cos\theta \to \cos(\pi - \theta) =
-\cos\theta$, corresponding to $x \to -x$.

EXAMPLE 11.1.3


Lowest Legendre Polynomials For the first few Legendre polynomials
(e.g., $P_0$, $P_1$, and $P_2$), we need the coefficients of $t^0$, $t^1$, and $t^2$ in Eq. (11.14).
These powers of t appear only in the terms n = 0, 1, and 2; hence, we may limit
our attention to the first three terms of the infinite series:

$$\frac{0!}{2^0 (0!)^2} (2xt - t^2)^0 + \frac{2!}{2^2 (1!)^2} (2xt - t^2)^1 + \frac{4!}{2^4 (2!)^2} (2xt - t^2)^2
 = 1\, t^0 + x\, t^1 + \left(\frac{3}{2} x^2 - \frac{1}{2}\right) t^2 + O(t^3).$$

Then, from Eq. (11.4) (and uniqueness of power series) we obtain

$$P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \frac{3}{2} x^2 - \frac{1}{2}, \tag{11.15}$$

confirming the entries of Table 11.1. We repeat this limited development in a
vector framework later in this section. ■


In employing a general treatment, we find that the binomial expansion of
the $(2xt - t^2)^n$ factor yields the double series

$$\begin{aligned}
(1 - 2xt + t^2)^{-1/2} &= \sum_{n=0}^{\infty} \frac{(2n)!}{2^{2n} (n!)^2}\, t^n \sum_{k=0}^{n} (-1)^k \frac{n!}{k!\,(n-k)!} (2x)^{n-k} t^k \\
&= \sum_{n=0}^{\infty} \sum_{k=0}^{n} (-1)^k \frac{(2n)!}{2^{2n}\, n!\, k!\, (n-k)!} (2x)^{n-k} t^{n+k}.
\end{aligned} \tag{11.16}$$

By rearranging the order of summation (valid by absolute convergence), Eq.
(11.16) becomes

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty} \sum_{k=0}^{[n/2]} (-1)^k \frac{(2n - 2k)!}{2^{2n-2k}\, k!\, (n-k)!\, (n-2k)!} (2x)^{n-2k} t^n, \tag{11.17}$$

with the $t^n$ independent of the index k.$^4$ Now, equating our two power series
[Eqs. (11.4) and (11.17)] term by term, we have$^5$

$$P_n(x) = \sum_{k=0}^{[n/2]} (-1)^k \frac{(2n - 2k)!}{2^n\, k!\, (n-k)!\, (n-2k)!}\, x^{n-2k}. \tag{11.18}$$

We read off this formula, from $k = 0$, that the highest power of $P_n(x)$ is $x^n$,
and the lowest power is $x^0 = 1$ for even n and x for odd n. This is consistent
with Example 11.1.3 and Table 11.1. Also, for n even, $P_n$ has only even powers
of x and thus even parity [see Eq. (11.13)] and odd powers and odd parity for
odd n.
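Equation (11.18) translates directly into code. The Python sketch below (function names are hypothetical) builds the exact rational coefficients of $P_n(x)$, checks the special values of Eqs. (11.6) and (11.8), and compares a partial sum of the generating-function series, Eq. (11.4), with the closed form.

```python
from fractions import Fraction
from math import factorial, sqrt

def legendre_coefficients(n: int) -> dict:
    """Map power -> coefficient of P_n(x), taken term by term from Eq. (11.18)."""
    coeffs = {}
    for k in range(n // 2 + 1):
        num = (-1) ** k * factorial(2 * n - 2 * k)
        den = 2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def legendre(n: int, x):
    """Evaluate P_n(x); exact if x is a Fraction, floating point if x is a float."""
    return sum(c * x ** p for p, c in legendre_coefficients(n).items())

def generating_function_partial_sum(t: float, x: float, n_max: int) -> float:
    """Partial sum of sum_n P_n(x) t^n from Eq. (11.4)."""
    return sum(legendre(n, x) * t ** n for n in range(n_max + 1))

if __name__ == "__main__":
    print(legendre_coefficients(4))          # coefficients of P_4(x)
    print([legendre(n, Fraction(1)) for n in range(6)])
    t, x = 0.3, 0.4
    print(generating_function_partial_sum(t, x, 30), 1.0 / sqrt(1 - 2 * x * t + t * t))
```

Using `Fraction` keeps the coefficients exact, which makes it easy to compare them with the entries of Table 11.1; for large n a recurrence relation would be the more economical route.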


4 $[n/2] = n/2$ for n even, $(n - 1)/2$ for n odd.

5 Equation (11.18) starts with $x^n$. By changing the index, we can transform it into a series that
starts with $x^0$ for n even and $x^1$ for n odd.

Biographical Data

Legendre, Adrien Marie. Legendre, a French mathematician who was
born in Paris in 1752 and died there in 1833, made major contributions
to number theory, elliptic integrals before Abel and Jacobi, and analysis.
He tried in vain to prove the parallel axiom of Euclidean geometry. His
taste in selecting research problems was remarkably similar to that of his
contemporary Gauss, but nobody could match Gauss’s depth and perfection.
His great textbooks had enormous impact.


Linear Electric Multipoles 


Returning to the electric charge on the z-axis, we demonstrate the usefulness
and power of the generating function by adding a charge $-q$ at $z = -a$, as
shown in Fig. 11.3, using the superposition principle of electric fields. The
potential becomes

$$\varphi = \frac{q}{4\pi\varepsilon_0} \left(\frac{1}{r_1} - \frac{1}{r_2}\right),$$

and by using the law of cosines, we have for $r > a$

$$\varphi = \frac{q}{4\pi\varepsilon_0 r} \left\{\left[1 - 2\,\frac{a}{r} \cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2} - \left[1 + 2\,\frac{a}{r} \cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2}\right\}, \tag{11.19}$$

where the second radical is like the first, except that a has been replaced by
$-a$. Then, using Eq. (11.4), we obtain

$$\begin{aligned}
\varphi &= \frac{q}{4\pi\varepsilon_0 r} \left[\sum_{n=0}^{\infty} P_n(\cos\theta) \left(\frac{a}{r}\right)^n - \sum_{n=0}^{\infty} P_n(\cos\theta)(-1)^n \left(\frac{a}{r}\right)^n\right] \\
&= \frac{2q}{4\pi\varepsilon_0 r} \left[P_1(\cos\theta)\, \frac{a}{r} + P_3(\cos\theta) \left(\frac{a}{r}\right)^3 + \cdots\right].
\end{aligned} \tag{11.20}$$

Figure 11.3 Electric Dipole

The first term (and dominant term for $r \gg a$) is the electric dipole potential

$$\varphi = \frac{2aq}{4\pi\varepsilon_0} \cdot \frac{P_1(\cos\theta)}{r^2} \tag{11.21}$$

with $2aq$ the electric dipole moment (Fig. 11.3). If the potential in Eq. (11.19)
is taken to be the dipole potential, then Eq. (11.21) gives its asymptotic behavior
for large r. This analysis may be extended by placing additional charges on the
z-axis so that the $P_1$ term, as well as the $P_0$ (monopole) term, is canceled.
For instance, charges of q at $z = a$ and $z = -a$, $-2q$ at $z = 0$ give rise to a
potential whose series expansion starts with $P_2(\cos\theta)$. This is a linear electric
quadrupole. Two linear quadrupoles may be placed so that the quadrupole
term is canceled, but the $P_3$, the electric octupole term, survives, etc. These
expansions are special cases of the general multipole expansion of the electric
potential.
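The quality of the dipole approximation can be checked numerically. In the Python sketch below, potentials are expressed in units of $1/(4\pi\varepsilon_0)$ and the function names and default values ($a = q = 1$) are illustrative assumptions; the exact two-charge potential of Eq. (11.19) is compared with the first one or two terms of Eq. (11.20).

```python
import math

def dipole_pair_potential(r: float, theta: float, a: float = 1.0, q: float = 1.0) -> float:
    """Exact potential of +q at z = a and -q at z = -a, in units of 1/(4 pi eps0)."""
    r1 = math.sqrt(r * r + a * a - 2.0 * a * r * math.cos(theta))
    r2 = math.sqrt(r * r + a * a + 2.0 * a * r * math.cos(theta))
    return q * (1.0 / r1 - 1.0 / r2)

def dipole_series(r: float, theta: float, a: float = 1.0, q: float = 1.0,
                  include_octupole: bool = False) -> float:
    """Leading terms of Eq. (11.20): P_1 (dipole) and optionally P_3 (octupole)."""
    x = math.cos(theta)
    value = x * (a / r)                                    # P_1(x) (a/r)
    if include_octupole:
        value += 0.5 * (5.0 * x ** 3 - 3.0 * x) * (a / r) ** 3   # P_3(x) (a/r)^3
    return 2.0 * q / r * value

if __name__ == "__main__":
    theta = 0.7
    for r in (2.0, 5.0, 10.0, 50.0):
        exact = dipole_pair_potential(r, theta)
        print(r, exact, dipole_series(r, theta), dipole_series(r, theta, include_octupole=True))
```

The relative error of the pure dipole term falls off like $(a/r)^2$, and including the octupole term reduces it to order $(a/r)^4$, in line with the absence of even-n terms in Eq. (11.20).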


Vector Expansion 


We consider the electrostatic potential produced by a distributed charge p 0 * 2 ): 

PO 2 ) 


<pOt) = 


1 /■ PO: 

47T£o J |ri - 


r 2 l 


-dr 2 . 


( 11 . 22 a) 


Taking the denominator of the integrand, using first the law of cosines and 
then a binomial expansion, yields 


1 


|ri -r 2 | 


(rf - 2 rj ■ r 2 + r|) 1/2 


(11.22b) 



2 ri • r 2 

'l 



for /'1 > r 2 


1 

n 


1*1 ■ 1*2 

r? 


lr[ 
2 rf 


3 (n ■ r 2 ) 2 
2 r\ 



For ri = \,n = t, and n • r 2 = xt, Eq. (11.22b) reduces to the generating 
function, Eq. (11.4). 

The first term in the square bracket, 1, yields a potential

$$
\varphi_0(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{r_1}\int \rho(\mathbf{r}_2)\,d\tau_2.
\tag{11.22c}
$$

The integral contains the total charge. This part of the total potential is an
electric monopole.

The second term yields

$$
\varphi_1(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{\mathbf{r}_1}{r_1^3}\cdot\int \mathbf{r}_2\,\rho(\mathbf{r}_2)\,d\tau_2,
\tag{11.22d}
$$

where the integral is the dipole moment whose charge density $\rho(\mathbf{r}_2)$ is weighted
by a moment arm $\mathbf{r}_2$. We have an electric dipole potential. For atomic or
nuclear states of definite parity, $\rho(\mathbf{r}_2)$ is an even function and the dipole integral
is identically zero. However, in the presence of an applied electric field a
superposition of odd/even parity states may develop so that the resulting induced
dipole moment is no longer zero. The last two terms, both of order $(r_2/r_1)^2$,
may be handled by using Cartesian coordinates

$$
(\mathbf{r}_1\cdot\mathbf{r}_2)^2 = \sum_{i=1}^{3} x_{1i}x_{2i}\sum_{j=1}^{3} x_{1j}x_{2j}.
$$

Rearranging variables to keep the $x_2$ inside the integral yields

$$
\varphi_2(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2r_1^5}\sum_{i,j=1}^{3} x_{1i}x_{1j}
\int \left[3x_{2i}x_{2j} - \delta_{ij}r_2^2\right]\rho(\mathbf{r}_2)\,d\tau_2.
\tag{11.22e}
$$

This is the electric quadrupole term. Note that the square bracket in the
integrand forms a symmetric tensor of zero trace.

A general electrostatic multipole expansion can also be developed by
using Eq. (11.22a) for the potential $\varphi(\mathbf{r}_1)$ and replacing $1/(4\pi|\mathbf{r}_1 - \mathbf{r}_2|)$ by a
(double) series of the angular solutions of the Poisson equation (which are the
same as those of the Laplace equation of Section 8.9).

Before leaving multipole fields, we emphasize three points: 

• First, an electric (or magnetic) multipole has a value independent of the
origin (reference point) only if all lower order terms vanish. For instance,
the potential of one charge $q$ at $z = a$ was expanded in a series of Legendre
polynomials. Although we refer to the $P_1(\cos\theta)$ term in this expansion as
a dipole term, it should be remembered that this term exists only because
of our choice of coordinates. We actually have a monopole, $P_0(\cos\theta)$, the
term of leading magnitude.

• Second, in physical systems we rarely encounter pure multipoles. For
example, the potential of the finite dipole ($q$ at $z = a$, $-q$ at $z = -a$)
contained a $P_3(\cos\theta)$ term. These higher order terms may be eliminated
by shrinking the multipole to a point multipole, in this case keeping the
product $qa$ constant ($a \to 0$, $q \to \infty$) to maintain the same dipole moment.

• Third, the multipole expansion is not restricted to electrical phenomena.
Planetary configurations are described in terms of mass multipoles.

It might also be noted that a multipole expansion is actually a decomposition
into the irreducible representations of the rotation group (Section 4.2). The
$l$th multipole involves the eigenfunctions of orbital angular momentum, $|lm\rangle$,
one for each component $m$ of the multipole $l$ (see Chapter 4). These $2l + 1$
components of the multipole form an irreducible representation because the
lowering operator $L_-$ applied repeatedly to the eigenfunction $|ll\rangle$ generates all
other eigenfunctions $|lm\rangle$, down to $m = -l$. The raising and lowering operators
$L_\pm$ are generators of the rotation group along with $L_z$, whose eigenvalue
is $m$.
EXERCISES

11.1.1 Develop the electrostatic potential for the array of charges shown. This
is a linear electric quadrupole (Fig. 11.4).

[Figure 11.4: Linear Electric Quadrupole, with charges $q$ at $z = -a$, $-2q$ at $z = 0$, and $q$ at $z = a$]

11.1.2 Calculate the electrostatic potential of the array of charges shown 
in Fig. 11.5. This is an example of two equal but oppositely directed 
dipoles. The dipole contributions cancel, but the octupole terms do 
not cancel. 


[Figure 11.5: Linear Electric Octupole, with charges $-q$, $+2q$, $-2q$, and $q$ along the $z$-axis]


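11.1.3 Show that the electrostatic potential produced by a charge $q$ at $z = a$
for $r < a$ is

$$
\varphi(\mathbf{r}) = \frac{q}{4\pi\varepsilon_0 a}\sum_{n=0}^{\infty}\left(\frac{r}{a}\right)^n P_n(\cos\theta).
$$

11.1.4 Using $\mathbf{E} = -\nabla\varphi$, determine the components of the electric field
corresponding to the (pure) electric dipole potential

$$
\varphi(\mathbf{r}) = \frac{2aq\,P_1(\cos\theta)}{4\pi\varepsilon_0 r^2}.
$$

Here, it is assumed that $r \gg a$.

ANS.
$$
E_r = +\frac{4aq\cos\theta}{4\pi\varepsilon_0 r^3}, \qquad
E_\theta = +\frac{2aq\sin\theta}{4\pi\varepsilon_0 r^3}, \qquad
E_\varphi = 0.
$$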


11.1.5 A point electric dipole of strength $p^{(1)}$ is placed at $z = a$; a second point
electric dipole of equal but opposite strength is at the origin. Keeping
the product $p^{(1)}a$ constant, let $a \to 0$. Show that this results in a point
electric quadrupole.

11.1.6 A point charge $q$ is in the interior of a hollow conducting sphere of
radius $r_0$. The charge $q$ is displaced a distance $a$ from the center of the
sphere. If the conducting sphere is grounded, show that the potential
in the interior produced by $q$ and the distributed induced charge is the
same as that produced by $q$ and its image charge $q'$. The image charge is
at a distance $a' = r_0^2/a$ from the center, collinear with $q$ and the origin
(Fig. 11.6).

[Figure 11.6: Image Charge $q'$]

Hint. Calculate the electrostatic potential for $a < r_0 < a'$. Show that
the potential vanishes for $r = r_0$ if we take $q' = -qr_0/a$.

11.1.7 Prove that

$$
P_n(\cos\theta) = (-1)^n\,\frac{r^{n+1}}{n!}\,\frac{\partial^n}{\partial z^n}\left(\frac{1}{r}\right).
$$

Hint. Compare the Legendre polynomial expansion of the generating
function ($a \to \Delta z$; Fig. 11.1) with a Taylor series expansion of $1/r$,
where the $z$ dependence of $r$ changes from $z$ to $z - \Delta z$ (Fig. 11.7).

[Figure 11.7: Geometry for $z \to z - \Delta z$]

11.1.8 By differentiation and direct substitution of the series form, Eq. (11.18),
show that $P_n(x)$ satisfies Legendre's ODE. Note that we may have any
$x$, $-\infty < x < \infty$, and indeed any $z$ in the entire finite complex plane.
























11.2 Recurrence Relations and Special Properties

Recurrence Relations 


The Legendre polynomial generating function provides a convenient way of
deriving the recurrence relations⁶ and some special properties. If our generating
function [Eq. (11.4)] is differentiated with respect to $t$, we obtain

$$
\frac{\partial g(t, x)}{\partial t} = \frac{x - t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty} nP_n(x)t^{n-1}.
\tag{11.23}
$$

By substituting Eq. (11.4) into this and rearranging terms, we have

$$
(1 - 2xt + t^2)\sum_{n=0}^{\infty} nP_n(x)t^{n-1} + (t - x)\sum_{n=0}^{\infty} P_n(x)t^n = 0.
\tag{11.24}
$$

The left-hand side is a power series in $t$. Since this power series vanishes for all
values of $t$, the coefficient of each power of $t$ is equal to zero; that is, our power
series is unique (Section 5.7). These coefficients are found by separating the
individual summations and using appropriate summation indices as follows:

$$
\sum_{m=0}^{\infty} mP_m(x)t^{m-1} - \sum_{n=0}^{\infty} 2nxP_n(x)t^n + \sum_{s=0}^{\infty} sP_s(x)t^{s+1}
+ \sum_{s=0}^{\infty} P_s(x)t^{s+1} - \sum_{n=0}^{\infty} xP_n(x)t^n = 0.
\tag{11.25}
$$

Now letting $m = n + 1$, $s = n - 1$, we find

$$
(2n + 1)xP_n(x) = (n + 1)P_{n+1}(x) + nP_{n-1}(x), \qquad n = 1, 2, 3, \ldots.
\tag{11.26}
$$

With this three-term recurrence relation we may easily construct the higher
Legendre polynomials. If we take $n = 1$ and insert the values of $P_0(x)$ and
$P_1(x)$ [Exercise 11.1.7 or Eq. (11.18)], we obtain

$$
3xP_1(x) = 2P_2(x) + P_0(x), \quad\text{or}\quad P_2(x) = \tfrac{1}{2}(3x^2 - 1).
$$

This process may be continued indefinitely; the first few Legendre polynomials
are listed in Table 11.1.

Cumbersome as it may appear at first, this technique is actually more
efficient for a computer than is direct evaluation of the series [Eq. (11.18)]. For
greater stability (to avoid undue accumulation and magnification of round-off
error), Eq. (11.26) is rewritten as

$$
P_{n+1}(x) = 2xP_n(x) - P_{n-1}(x) - \frac{xP_n(x) - P_{n-1}(x)}{n + 1}.
\tag{11.26a}
$$

6 We can also apply the explicit series form [Eq. (11.18)] directly.

One starts with $P_0(x) = 1$, $P_1(x) = x$, and computes the numerical values
of all the $P_n(x)$ for a given value of $x$, up to the desired $P_N(x)$. The values of
$P_n(x)$, $0 \le n < N$, are available as a fringe benefit.
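The upward recursion just described can be sketched in a few lines. This is only an illustration of Eq. (11.26a) under the stated starting values; the function name `legendre_values` is a hypothetical choice, not something defined in the text.

```python
def legendre_values(n_max, x):
    """Return [P_0(x), P_1(x), ..., P_{n_max}(x)] for n_max >= 0.

    Uses the rewritten three-term recurrence, Eq. (11.26a):
        P_{n+1} = 2 x P_n - P_{n-1} - (x P_n - P_{n-1}) / (n + 1),
    starting from P_0 = 1 and P_1 = x.
    """
    values = [1.0]
    if n_max == 0:
        return values
    values.append(float(x))
    for n in range(1, n_max):
        p_n, p_prev = values[n], values[n - 1]
        values.append(2.0 * x * p_n - p_prev - (x * p_n - p_prev) / (n + 1))
    return values

if __name__ == "__main__":
    # All lower-order values come along as the "fringe benefit" noted above.
    print(legendre_values(5, 0.5))
```

A single pass costs a fixed number of multiplications per order, so evaluating all $P_n(x)$ up to $P_N(x)$ takes work proportional to $N$.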

To practice, let us derive another recursion relation from the generating 
function. 


EXAMPLE 11.2.1 


Recursion Formula Consider the product

$$
\begin{aligned}
g(t, x)\,g(t, -x) &= (1 - 2xt + t^2)^{-1/2}(1 + 2xt + t^2)^{-1/2} \\
&= \left[(1 + t^2)^2 - 4x^2t^2\right]^{-1/2} = \left[t^4 + 2t^2(1 - 2x^2) + 1\right]^{-1/2}
\end{aligned}
$$

and recognize the generating function, upon replacing $t^2 \to t$, $2x^2 - 1 \to x$.
Using Eq. (11.4) and comparing coefficients of the power series in $t$ we
therefore have derived

$$
g(t, x)\,g(t, -x) = \sum_{m,n} P_m(x)P_n(-x)\,t^{m+n} = \sum_{N} P_N(2x^2 - 1)\,t^{2N},
$$

or, for $m + n = 2N$ and $m + n = 2N - 1$, respectively,

$$
P_N(2x^2 - 1) = \sum_{n=0}^{2N} P_{2N-n}(x)P_n(-x),
\tag{11.27a}
$$

$$
\sum_{n=0}^{2N-1} P_{2N-n-1}(x)P_n(-x) = 0.
\tag{11.27b}
$$

For $N = 1$ we check first that

$$
\sum_{n=0}^{1} P_{1-n}(x)P_n(-x) = x\cdot 1 - 1\cdot x = 0
$$

and second that

$$
P_1(2x^2 - 1) = \sum_{n=0}^{2} P_{2-n}(x)P_n(-x) = \frac{3x^2 - 1}{2} - x\cdot x + \frac{3x^2 - 1}{2} = 2x^2 - 1. \;\blacksquare
$$
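The $N = 1$ check above extends to higher $N$ numerically. The sketch below is illustrative only: `legendre`, `even_sum`, and `odd_sum` are hypothetical helper names, and the polynomials are evaluated with the recurrence of Eq. (11.26).

```python
def legendre(n, x):
    """Evaluate P_n(x) with the three-term recurrence, Eq. (11.26)."""
    if n == 0:
        return 1.0
    p_prev, p_cur = 1.0, float(x)
    for k in range(1, n):
        p_prev, p_cur = p_cur, ((2 * k + 1) * x * p_cur - k * p_prev) / (k + 1)
    return p_cur

def even_sum(big_n, x):
    """Right-hand side of Eq. (11.27a)."""
    return sum(legendre(2 * big_n - n, x) * legendre(n, -x)
               for n in range(2 * big_n + 1))

def odd_sum(big_n, x):
    """Left-hand side of Eq. (11.27b), which should vanish."""
    return sum(legendre(2 * big_n - n - 1, x) * legendre(n, -x)
               for n in range(2 * big_n))

if __name__ == "__main__":
    x = 0.3
    for big_n in (1, 2, 3, 4):
        lhs = legendre(big_n, 2 * x * x - 1)
        print(big_n, lhs, even_sum(big_n, x), odd_sum(big_n, x))
```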


Differential Equations 


More information about the behavior of the Legendre polynomials can be
obtained if we now differentiate Eq. (11.4) with respect to $x$. This gives

$$
\frac{\partial g(t, x)}{\partial x} = \frac{t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty} P_n'(x)t^n
\tag{11.28}
$$

or

$$
(1 - 2xt + t^2)\sum_{n=0}^{\infty} P_n'(x)t^n - t\sum_{n=0}^{\infty} P_n(x)t^n = 0.
\tag{11.29}
$$

As before, the coefficient of each power of $t$ is set equal to zero and we obtain

$$
P_{n+1}'(x) + P_{n-1}'(x) = 2xP_n'(x) + P_n(x).
\tag{11.30}
$$

A more useful relation may be found by differentiating Eq. (11.26) with
respect to $x$ and multiplying by 2. To this we add $(2n + 1)$ times Eq. (11.30),
canceling the $P_n'$ term. The result is

$$
P_{n+1}'(x) - P_{n-1}'(x) = (2n + 1)P_n(x).
\tag{11.31}
$$

From Eqs. (11.30) and (11.31) numerous additional equations may be
developed,⁷ including

$$
P_{n+1}'(x) = (n + 1)P_n(x) + xP_n'(x),
\tag{11.32}
$$

$$
P_{n-1}'(x) = -nP_n(x) + xP_n'(x),
\tag{11.33}
$$

$$
(1 - x^2)P_n'(x) = nP_{n-1}(x) - nxP_n(x),
\tag{11.34}
$$

$$
(1 - x^2)P_n'(x) = (n + 1)xP_n(x) - (n + 1)P_{n+1}(x).
\tag{11.35}
$$

By differentiating Eq. (11.34) and using Eq. (11.33) to eliminate $P_{n-1}'(x)$, we
find that $P_n(x)$ satisfies the linear, second-order ODE

$$
(1 - x^2)P_n''(x) - 2xP_n'(x) + n(n + 1)P_n(x) = 0
$$

or

$$
\frac{d}{dx}\left[(1 - x^2)\frac{dP_n(x)}{dx}\right] + n(n + 1)P_n(x) = 0.
\tag{11.36}
$$

In the second form the ODE is self-adjoint. The previous equations, Eqs.
(11.30)-(11.35), are all first-order ODEs but with polynomials of two different
indices. The price for having all indices alike is a second-order differential
equation. Equation (11.36) is Legendre's ODE. We now see that the polynomials
$P_n(x)$ generated by the power series for $(1 - 2xt + t^2)^{-1/2}$ satisfy Legendre's
equation, which, of course, is why they are called Legendre polynomials.

In Eq. (11.36) differentiation is with respect to $x$ ($= \cos\theta$). Frequently,
we encounter Legendre's equation expressed in terms of differentiation with
respect to $\theta$:

$$
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_n(\cos\theta)}{d\theta}\right) + n(n + 1)P_n(\cos\theta) = 0.
\tag{11.37}
$$


7 Using the equation number in parentheses to denote the left-hand side of the equation, we may
write the derivations as

$$
\begin{aligned}
2\cdot\frac{d}{dx}(11.26) + (2n + 1)\cdot(11.30) &\Rightarrow (11.31) \\
\tfrac{1}{2}\{(11.30) + (11.31)\} &\Rightarrow (11.32) \\
\tfrac{1}{2}\{(11.30) - (11.31)\} &\Rightarrow (11.33) \\
(11.32)_{n \to n-1} + x\cdot(11.33) &\Rightarrow (11.34) \\
\frac{d}{dx}(11.34) + n\cdot(11.33) &\Rightarrow (11.36).
\end{aligned}
$$

Upper and Lower Bounds for $P_n(\cos\theta)$


Finally, in addition to these results, our generating function enables us to set
an upper limit on $|P_n(\cos\theta)|$. We have

$$
\begin{aligned}
(1 - 2t\cos\theta + t^2)^{-1/2} &= (1 - te^{i\theta})^{-1/2}(1 - te^{-i\theta})^{-1/2} \\
&= \left(1 + \tfrac{1}{2}te^{i\theta} + \tfrac{3}{8}t^2e^{2i\theta} + \cdots\right)
\cdot\left(1 + \tfrac{1}{2}te^{-i\theta} + \tfrac{3}{8}t^2e^{-2i\theta} + \cdots\right),
\end{aligned}
\tag{11.38}
$$

with all coefficients positive. Our Legendre polynomial, $P_n(\cos\theta)$, still the
coefficient of $t^n$, may now be written as a sum of terms of the form

$$
\tfrac{1}{2}a_m\left(e^{im\theta} + e^{-im\theta}\right) = a_m\cos m\theta
\tag{11.39a}
$$

with all the $a_m$ positive. Then

$$
P_n(\cos\theta) = \sum_{m=0\ \mathrm{or}\ 1}^{n} a_m\cos m\theta.
\tag{11.39b}
$$

This series, Eq. (11.39b), is clearly a maximum when $\theta = 0$ and all $\cos m\theta = 1$
are maximal. However, for $x = \cos\theta = 1$, Eq. (11.6) shows that $P_n(1) = 1$.
Therefore,

$$
|P_n(\cos\theta)| \le P_n(1) = 1.
\tag{11.39c}
$$

A fringe benefit of Eq. (11.39b) is that it shows that our Legendre polynomial
is a linear combination of $\cos m\theta$. This means that the Legendre polynomials
form a complete set for any functions that may be expanded in
series of $\cos m\theta$ over the interval $[0, \pi]$.


SUMMARY 


In this section, various useful properties of the Legendre polynomials are
derived from the generating function, Eq. (11.4). The explicit series
representation, Eq. (11.18), offers an alternate and sometimes superior approach.


EXERCISES 
11.2.1 Given the series

$$
\alpha_0 + \alpha_2\cos^2\theta + \alpha_4\cos^4\theta + \alpha_6\cos^6\theta = a_0P_0 + a_2P_2 + a_4P_4 + a_6P_6,
$$

express the coefficients $\alpha_i$ as a column vector $\boldsymbol{\alpha}$ and the coefficients
$a_i$ as a column vector $\mathbf{a}$ and determine the matrices $\mathsf{A}$ and $\mathsf{B}$ such that

$$
\mathsf{A}\boldsymbol{\alpha} = \mathbf{a} \quad\text{and}\quad \mathsf{B}\mathbf{a} = \boldsymbol{\alpha}.
$$

Check your computation by showing that $\mathsf{AB} = 1$ (unit matrix). Repeat
for the odd case

$$
\alpha_1\cos\theta + \alpha_3\cos^3\theta + \alpha_5\cos^5\theta + \alpha_7\cos^7\theta = a_1P_1 + a_3P_3 + a_5P_5 + a_7P_7.
$$

Note. $P_n(\cos\theta)$ and $\cos^n\theta$ are tabulated in terms of each other in
AMS-55.

11.2.2 By differentiating the generating function, $g(t, x)$, with respect to $t$,
multiplying by $2t$, and then adding $g(t, x)$, show that

$$
\frac{1 - t^2}{(1 - 2tx + t^2)^{3/2}} = \sum_{n=0}^{\infty}(2n + 1)P_n(x)t^n.
$$

This result is useful in calculating the charge induced on a grounded
metal sphere by a point charge $q$.

11.2.3 (a) Derive Eq. (11.35)

$$
(1 - x^2)P_n'(x) = (n + 1)xP_n(x) - (n + 1)P_{n+1}(x).
$$

(b) Write out the relation of Eq. (11.35) to preceding equations in
symbolic form analogous to the symbolic forms for Eqs. (11.31)-(11.34).

11.2.4 A point electric octupole may be constructed by placing a point
electric quadrupole (pole strength $p^{(2)}$ in the $z$-direction) at $z = a$ and an
equal but opposite point electric quadrupole at $z = 0$ and then letting
$a \to 0$, subject to $p^{(2)}a = $ constant. Find the electrostatic potential
corresponding to a point electric octupole. Show from the construction
of the point electric octupole that the corresponding potential may be
obtained by differentiating the point quadrupole potential.

11.2.5 Operating in spherical polar coordinates, show that

$$
\frac{\partial}{\partial z}\left[\frac{P_n(\cos\theta)}{r^{n+1}}\right] = -(n + 1)\,\frac{P_{n+1}(\cos\theta)}{r^{n+2}}.
$$

This is the key step in the mathematical argument that the derivative
of one multipole leads to the next higher multipole.

Hint. Compare Exercise 2.5.12.

11.2.6 From

$$
P_L(\cos\theta) = \frac{1}{L!}\,\frac{\partial^L}{\partial t^L}\left(1 - 2t\cos\theta + t^2\right)^{-1/2}\Big|_{t=0}
$$

show that

$$
P_L(1) = 1, \qquad P_L(-1) = (-1)^L.
$$

11.2.7 Prove that

$$
P_n'(1) = \frac{d}{dx}P_n(x)\Big|_{x=1} = \tfrac{1}{2}n(n + 1).
$$

11.2.8 Show that $P_n(\cos\theta) = (-1)^nP_n(-\cos\theta)$ by use of the recurrence
relation relating $P_n$, $P_{n+1}$, and $P_{n-1}$ and your knowledge of $P_0$ and $P_1$.

11.2.9 From Eq. (11.38) write out the coefficient of $t^2$ in terms of $\cos n\theta$, $n \le 2$.
This coefficient is $P_2(\cos\theta)$.
11.3 Orthogonality


Legendre's ODE [Eq. (11.36)] may be written in the form (Section 9.1)

$$
\frac{d}{dx}\left[(1 - x^2)P_n'(x)\right] + n(n + 1)P_n(x) = 0,
\tag{11.40}
$$

showing clearly that it is self-adjoint. Subject to satisfying certain boundary
conditions, then, we know that the eigenfunction solutions $P_n(x)$ are
orthogonal. Upon comparing Eq. (11.40) with Eqs. (9.6) and (9.8) we see that the
weight function $w(x) = 1$, $\mathcal{L} = (d/dx)(1 - x^2)(d/dx)$, $p(x) = 1 - x^2$, and the
eigenvalue $\lambda = n(n + 1)$. The integration limits on $x$ are $\pm 1$, where $p(\pm 1) = 0$.
Then for $m \ne n$, Eq. (9.34) becomes

$$
\int_{-1}^{1} P_n(x)P_m(x)\,dx = 0,^{8}
\tag{11.41}
$$

$$
\int_{0}^{\pi} P_n(\cos\theta)P_m(\cos\theta)\sin\theta\,d\theta = 0,
\tag{11.42}
$$

showing that $P_n(x)$ and $P_m(x)$ are orthogonal for the interval $[-1, 1]$.

We need to evaluate the integral [Eq. (11.41)] when $n = m$. Certainly, it is
no longer zero. From our generating function

$$
(1 - 2tx + t^2)^{-1} = \left[\sum_{n=0}^{\infty} P_n(x)t^n\right]^2.
\tag{11.43}
$$

Integrating from $x = -1$ to $x = +1$, we have

$$
\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1}\left[P_n(x)\right]^2dx.
\tag{11.44}
$$

The cross terms in the series vanish by means of Eq. (11.41). Using
$y = 1 - 2tx + t^2$, we obtain

$$
\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \frac{1}{2t}\int_{(1-t)^2}^{(1+t)^2}\frac{dy}{y} = \frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right).
\tag{11.45}
$$

Expanding this in a power series (Exercise 5.4.1) gives us

$$
\frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right) = 2\sum_{n=0}^{\infty}\frac{t^{2n}}{2n + 1}.
\tag{11.46}
$$


8 In Section 9.4 such integrals are interpreted as inner products in a linear vector (function) space.
Alternate notations are

$$
\int_{-1}^{1} P_n(x)P_m(x)\,dx \equiv \langle P_n(x)|P_m(x)\rangle \equiv \left(P_n(x), P_m(x)\right).
$$

The $\langle\ \rangle$ form, popularized by Dirac, is common in physics literature. The form $(\ ,\ )$ is more common
in mathematics literature.

Comparing power series coefficients of Eqs. (11.44) and (11.46), we must have

$$
\int_{-1}^{1}\left[P_n(x)\right]^2dx = \frac{2}{2n + 1}.
\tag{11.47}
$$

Combining Eq. (11.41) with Eq. (11.47) we have the orthonormality condition

$$
\int_{-1}^{1} P_m(x)P_n(x)\,dx = \frac{2\delta_{mn}}{2n + 1}.
\tag{11.48}
$$

Therefore, the $P_n$ are not normalized to unity. We return to this normalization
in Section 11.5, when we construct the orthonormal spherical harmonics.
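Equation (11.48) can be verified exactly for low orders with rational arithmetic. The sketch below is an illustration, not the text's method: the helper names are hypothetical, the polynomial coefficients are built from the recurrence of Eq. (11.26), and the integrals use $\int_{-1}^{1}x^k\,dx = 2/(k+1)$ for even $k$ and 0 for odd $k$.

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Coefficients [c_0, c_1, ..., c_n] of P_n(x) = sum_k c_k x^k.

    Built with (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}, Eq. (11.26).
    """
    p_prev, p_cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        shifted = [Fraction(0)] + p_cur  # coefficients of x * P_k
        nxt = [Fraction(2 * k + 1, k + 1) * c for c in shifted]
        for i, c in enumerate(p_prev):
            nxt[i] -= Fraction(k, k + 1) * c
        p_prev, p_cur = p_cur, nxt
    return p_cur

def inner_product(m, n):
    """Exact value of the integral of P_m(x) P_n(x) over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(legendre_coeffs(m)):
        for j, b in enumerate(legendre_coeffs(n)):
            if (i + j) % 2 == 0:  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

if __name__ == "__main__":
    for m in range(4):
        print([str(inner_product(m, n)) for n in range(4)])
```

The printed matrix is diagonal with entries $2/(2n+1)$, which is the content of Eq. (11.48).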


Expansion of Functions, Legendre Series 


In addition to orthogonality, the Sturm-Liouville theory shows that the
Legendre polynomials form a complete set. Let us assume, then, that the series

$$
\sum_{n=0}^{\infty} a_nP_n(x) = f(x), \quad\text{or}\quad |f\rangle = \sum_{n} a_n|P_n\rangle,
\tag{11.49}
$$

defines $f(x)$ in the sense of convergence in the mean (Section 9.4) in the
interval $[-1, 1]$. This demands that $f(x)$ and $f'(x)$ be at least sectionally continuous
in this interval. The coefficients $a_n$ are found by multiplying the series by $P_m(x)$
and integrating term by term. Using the orthogonality property expressed in
Eqs. (11.42) and (11.48), we obtain

$$
\frac{2}{2m + 1}\,a_m = \int_{-1}^{1} P_m(x)f(x)\,dx = \langle P_m|f\rangle = \sum_{n} a_n\langle P_m|P_n\rangle.
\tag{11.50}
$$

We replace the variable of integration $x$ by $t$ and the index $m$ by $n$. Then,
substituting into Eq. (11.49), we have

$$
f(x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}\left(\int_{-1}^{1} f(t)P_n(t)\,dt\right)P_n(x).
\tag{11.51}
$$

This expansion in a series of Legendre polynomials is usually referred to
as a Legendre series.⁹ Its properties are quite similar to the more familiar
Fourier series (Chapter 14). In particular, we can use the orthogonality
property [Eq. (11.48)] to show that the series is unique.

On a more abstract (and more powerful) level, Eq. (11.51) gives the
representation of $f(x)$ in the linear vector space of Legendre polynomials (a Hilbert
space; Section 9.4).

Equation (11.51) may also be interpreted in terms of the projection
operators of quantum theory. We may define a projection operator

$$
\mathcal{P}_m \equiv P_m(x)\,\frac{2m + 1}{2}\int_{-1}^{1} P_m(t)[\ \ ]\,dt
$$

as an (integral) operator, ready to operate on $f(t)$. [The $f(t)$ would go in the
square bracket as a factor in the integrand.] Then, from Eq. (11.50)

$$
\mathcal{P}_m f = a_mP_m(x).^{10}
$$

The operator $\mathcal{P}_m$ projects out the $m$th component of the function $f$.

9 Note that Eq. (11.50) gives $a_m$ as a definite integral, that is, a number for a given $f(x)$.


EXAMPLE 11.3.1 


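Legendre Expansion Expand $f(x) = x(x + 1)(x - 1)$ in the interval $-1 \le x \le 1$.

Because $f(x)$ is odd under parity and is a third-order polynomial, we expect
only $P_1$, $P_3$. However, we check all coefficients:

$$
\begin{aligned}
2a_0 &= \int_{-1}^{1}(x^3 - x)\,dx = \left[\tfrac{1}{4}x^4 - \tfrac{1}{2}x^2\right]_{-1}^{1} = 0, \quad\text{also by parity,} \\
\tfrac{2}{3}a_1 &= \int_{-1}^{1}(x^4 - x^2)\,dx = \left[\tfrac{1}{5}x^5 - \tfrac{1}{3}x^3\right]_{-1}^{1} = \tfrac{2}{5} - \tfrac{2}{3} = -\tfrac{4}{15}, \\
\tfrac{2}{5}a_2 &= \tfrac{1}{2}\int_{-1}^{1}(x^3 - x)(3x^2 - 1)\,dx = 0, \quad\text{by parity;} \\
\tfrac{2}{7}a_3 &= \tfrac{1}{2}\int_{-1}^{1}(x^3 - x)(5x^3 - 3x)\,dx = \tfrac{1}{2}\int_{-1}^{1}(5x^6 - 8x^4 + 3x^2)\,dx \\
&= \tfrac{1}{2}\left[\tfrac{5}{7}x^7 - \tfrac{8}{5}x^5 + x^3\right]_{-1}^{1} = \tfrac{5}{7} - \tfrac{8}{5} + 1 = \tfrac{4}{35}.
\end{aligned}
$$

Finally, using $a_1$, $a_3$, we verify that $-\tfrac{2}{5}x + \tfrac{1}{5}(5x^3 - 3x) = x(x^2 - 1)$. ■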
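The coefficients in this example can be reproduced mechanically from Eq. (11.50). The following is a minimal sketch for polynomial $f(x)$ only; the helper names `legendre_coeffs` and `legendre_series` are hypothetical, and exact rational arithmetic is an illustrative choice rather than anything prescribed by the text.

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Power-series coefficients [c_0, ..., c_n] of P_n(x), from Eq. (11.26)."""
    p_prev, p_cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        shifted = [Fraction(0)] + p_cur  # coefficients of x * P_k
        nxt = [Fraction(2 * k + 1, k + 1) * c for c in shifted]
        for i, c in enumerate(p_prev):
            nxt[i] -= Fraction(k, k + 1) * c
        p_prev, p_cur = p_cur, nxt
    return p_cur

def legendre_series(poly):
    """Legendre coefficients [a_0, ..., a_deg] of f(x) = sum_k poly[k] x^k.

    Implements a_n = (2n + 1)/2 * integral of f(x) P_n(x) over [-1, 1],
    i.e. Eq. (11.50), with the monomial integrals done exactly.
    """
    result = []
    for n in range(len(poly)):
        integral = Fraction(0)
        for i, a in enumerate(poly):
            for j, b in enumerate(legendre_coeffs(n)):
                if (i + j) % 2 == 0:  # odd powers integrate to zero
                    integral += Fraction(a) * b * Fraction(2, i + j + 1)
        result.append(Fraction(2 * n + 1, 2) * integral)
    return result

if __name__ == "__main__":
    # f(x) = x^3 - x, stored as coefficients of x^0, x^1, x^2, x^3
    print([str(a) for a in legendre_series([0, -1, 0, 1])])
```

For $f(x) = x^3 - x$ this gives $a_1 = -2/5$ and $a_3 = 2/5$, with $a_0 = a_2 = 0$, in agreement with the hand calculation.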


Equation (11.3), which leads directly to the generating function definition
of Legendre polynomials, is a Legendre expansion of $1/r_1$. Going beyond a
simple Coulomb field, the $1/r_{12}$ is often replaced by a potential $V(|\mathbf{r}_1 - \mathbf{r}_2|)$
and the solution of the problem is again effected by a Legendre expansion.

The Legendre series, Eq. (11.49), has been treated as a known function
$f(x)$ that we arbitrarily chose to expand in a series of Legendre polynomials.
Sometimes the origin and nature of the Legendre series are different. In the next
examples we consider unknown functions we know can be represented by
a Legendre series because of the differential equation the unknown functions
satisfy. As before, the problem is to determine the unknown coefficients in the
series expansion. Here, however, the coefficients are not found by Eq. (11.50).
Rather, they are determined by demanding that the Legendre series match a
known solution at a boundary. These are boundary value problems.


EXAMPLE 11.3.2 


Sphere in a Uniform Field Another illustration of the use of Legendre
polynomials is provided by the problem of a neutral conducting sphere (radius
$r_0$) placed in a (previously) uniform electric field (Fig. 11.8). The problem is to
find the new, perturbed, electrostatic potential. The electrostatic potential¹¹
$V$ satisfies

$$
\nabla^2V = 0,
\tag{11.52}
$$

Laplace's equation. We select spherical polar coordinates because of the
spherical shape of the conductor. (This will simplify the application of the boundary
condition at the surface of the conductor.) We can write the unknown potential
$V(r, \theta)$ in the region outside the sphere as a linear combination of solutions
of the Laplace equation, called harmonic polynomials (check by applying the
Laplacian in spherical polar coordinates from Chapter 2):

$$
V = \sum_{n=0}^{\infty} a_nr^nP_n(\cos\theta) + \sum_{n=0}^{\infty} b_n\,\frac{P_n(\cos\theta)}{r^{n+1}}.
\tag{11.53}
$$

[Figure 11.8: Conducting Sphere in a Uniform Field]

No $\varphi$ dependence appears because of the axial (azimuthal) symmetry of
our problem. (The center of the conducting sphere is taken as the origin and
the $z$-axis is oriented parallel to the original uniform field.)

It might be noted here that $n$ is an integer because only for integral $n$ is the
$\theta$ dependence well behaved at $\cos\theta = \pm 1$. For nonintegral $n$, the solutions
of Legendre's equation diverge at the ends of the interval $[-1, 1]$, the poles
$\theta = 0, \pi$ of the sphere (compare Exercises 5.2.11 and 8.5.5). It is for this same
reason that the irregular solution of Legendre's equation is also excluded.


10 The dependent variables are arbitrary. Here, $x$ came from the $x$ in $\mathcal{P}_m$.

11 It should be emphasized that this is not a presentation of a Legendre series expansion of a
known $V(\cos\theta)$. Here, we deal with a boundary value problem of a partial differential equation
(see Section 8.9).
Now we turn to our (Dirichlet) boundary conditions to determine the
unknown $a_n$ and $b_n$ of our series solution, Eq. (11.53). If the original unperturbed
electrostatic field is $E_0 = |\mathbf{E}_0|$, we require, as one boundary condition,

$$
V(r \to \infty) = -E_0z = -E_0r\cos\theta = -E_0rP_1(\cos\theta).
\tag{11.54}
$$

Since our Legendre series is unique, we may equate coefficients of $P_n(\cos\theta)$
in Eq. (11.53) ($r \to \infty$) and Eq. (11.54) to obtain

$$
a_n = 0, \quad n > 1 \ \text{and}\ n = 0, \qquad a_1 = -E_0.
\tag{11.55}
$$

If $a_n \ne 0$ for $n > 1$, these terms would dominate at large $r$ and the boundary
condition [Eq. (11.54)] could not be satisfied.

As a second boundary condition, we may choose the conducting sphere
and the plane $\theta = \pi/2$ to be at zero potential, which means that Eq. (11.53)
now becomes

$$
V(r = r_0) = \frac{b_0}{r_0} + \left(\frac{b_1}{r_0^2} - E_0r_0\right)P_1(\cos\theta) + \sum_{n=2}^{\infty} b_n\,\frac{P_n(\cos\theta)}{r_0^{n+1}} = 0.
\tag{11.56}
$$

In order that this may hold for all values of $\theta$, each coefficient of $P_n(\cos\theta)$
must vanish.¹² Hence,

$$
b_0 = 0,^{13} \qquad b_n = 0, \quad n \ge 2,
\tag{11.57}
$$

whereas

$$
b_1 = E_0r_0^3.
\tag{11.58}
$$

The electrostatic potential (outside the sphere) is then

$$
V = -E_0rP_1(\cos\theta) + \frac{E_0r_0^3}{r^2}P_1(\cos\theta) = -E_0rP_1(\cos\theta)\left(1 - \frac{r_0^3}{r^3}\right).
\tag{11.59}
$$

It can be shown that a solution of Laplace's equation that satisfies the boundary
conditions over the entire boundary is unique. The electrostatic potential $V$, as
given by Eq. (11.59), is a solution of Laplace's equation. It satisfies our boundary
conditions and therefore is the solution of Laplace's equation for this problem.

It may further be shown (Exercise 11.3.13) that there is an induced surface
charge density

$$
\sigma = -\varepsilon_0\,\frac{\partial V}{\partial r}\bigg|_{r=r_0} = 3\varepsilon_0E_0\cos\theta
\tag{11.60}
$$

on the surface of the sphere and an induced electric dipole moment

$$
P = 4\pi r_0^3\varepsilon_0E_0.
\tag{11.61}
$$

12 Again, this is equivalent to saying that a series expansion in Legendre polynomials (or any
complete orthogonal set) is unique.

13 The coefficient of $P_0$ is $b_0/r_0$. We set $b_0 = 0$ since there is no net charge on the sphere. If there
is a net charge $q$, then $b_0 \ne 0$.
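The boundary behavior of Eq. (11.59) and the surface charge of Eq. (11.60) are easy to check numerically. The sketch below is an illustration with hypothetical function names; the one-sided finite difference and the step size `h` are assumptions made here for the check, since Eq. (11.59) holds only outside the sphere.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def potential(r, theta, e0, r0):
    """Eq. (11.59): potential outside a grounded sphere in a uniform field."""
    return -e0 * r * math.cos(theta) * (1.0 - (r0 / r) ** 3)

def surface_charge(theta, e0, r0, h=1e-6):
    """Eq. (11.60): sigma = -eps0 * dV/dr at r = r0.

    The derivative is approximated by an outward one-sided difference.
    """
    dv_dr = (potential(r0 + h, theta, e0, r0) - potential(r0, theta, e0, r0)) / h
    return -EPS0 * dv_dr

if __name__ == "__main__":
    e0, r0, theta = 100.0, 0.5, 0.3
    print(potential(r0, theta, e0, r0))            # vanishes on the sphere
    print(surface_charge(theta, e0, r0))           # numerical sigma
    print(3.0 * EPS0 * e0 * math.cos(theta))       # 3 eps0 E0 cos(theta)
```

Far from the sphere the same `potential` function approaches $-E_0r\cos\theta$, the other boundary condition, Eq. (11.54).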


EXAMPLE 11.3.3 


Electrostatic Potential of a Ring of Charge As a further example,
consider the electrostatic potential produced by a conducting ring carrying a total
electric charge $q$ (Fig. 11.9). From electrostatics (and Section 1.14) the
potential $\psi$ satisfies Laplace's equation. Separating variables in spherical polar
coordinates, we obtain

$$
\psi(r, \theta) = \sum_{n=0}^{\infty} c_n\,\frac{a^n}{r^{n+1}}\,P_n(\cos\theta), \qquad r > a,
\tag{11.62a}
$$

where $a$ is the radius of the ring that is assumed to be in the $\theta = \pi/2$ plane.
There is no $\varphi$ (azimuthal) dependence because of the cylindrical symmetry of
the system. The terms with positive exponent in the radial dependence have
been rejected since the potential must have an asymptotic behavior

$$
\psi(r, \theta) \sim \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{r}, \qquad r \gg a.
\tag{11.62b}
$$

The problem is to determine the coefficients $c_n$ in Eq. (11.62a). This may be
done by evaluating $\psi(r, \theta)$ at $\theta = 0$, $r = z$, and comparing with an independent
calculation of the potential from Coulomb's law. In effect, we are using a
boundary condition along the $z$-axis. From Coulomb's law (with all charge
equidistant),

$$
\begin{aligned}
\psi(r, \theta) &= \frac{q}{4\pi\varepsilon_0}\,\frac{1}{(z^2 + a^2)^{1/2}}, \qquad \theta = 0,\ r = z, \\
&= \frac{q}{4\pi\varepsilon_0z}\sum_{s=0}^{\infty}(-1)^s\,\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{z}\right)^{2s}, \qquad z > a.
\end{aligned}
\tag{11.62c}
$$

[Figure 11.9: Charged, Conducting Ring]

The last step uses the result of Exercise 10.1.15. Now, Eq. (11.62a) evaluated
at $\theta = 0$, $r = z$ [with $P_n(1) = 1$], yields

$$
\psi(r, \theta) = \sum_{n=0}^{\infty} c_n\,\frac{a^n}{z^{n+1}}, \qquad r = z.
\tag{11.62d}
$$

Comparing Eqs. (11.62c) and (11.62d), we get $c_n = 0$ for $n$ odd. Setting $n = 2s$,
we have

$$
c_{2s} = \frac{q}{4\pi\varepsilon_0}\,(-1)^s\,\frac{(2s)!}{2^{2s}(s!)^2},
\tag{11.62e}
$$

and our electrostatic potential $\psi(r, \theta)$ is given by

$$
\psi(r, \theta) = \frac{q}{4\pi\varepsilon_0r}\sum_{s=0}^{\infty}(-1)^s\,\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{r}\right)^{2s}P_{2s}(\cos\theta), \qquad r > a.
\tag{11.62f}
$$
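A partial sum of Eq. (11.62f) can be compared with the closed-form on-axis result of Eq. (11.62c). This is a sketch under stated assumptions: the names `legendre` and `ring_potential` are hypothetical, `k` stands in for $1/(4\pi\varepsilon_0)$, and the truncation at `terms` is an arbitrary illustrative choice that works well when $a/r$ is not close to 1.

```python
import math

def legendre(n, x):
    """Evaluate P_n(x) with the three-term recurrence, Eq. (11.26)."""
    if n == 0:
        return 1.0
    p_prev, p_cur = 1.0, float(x)
    for k in range(1, n):
        p_prev, p_cur = p_cur, ((2 * k + 1) * x * p_cur - k * p_prev) / (k + 1)
    return p_cur

def ring_potential(r, theta, a, q=1.0, k=1.0, terms=60):
    """Partial sum of Eq. (11.62f), intended for r > a.

    k is a stand-in for 1/(4*pi*eps0); terms is the number of s values kept.
    """
    x = math.cos(theta)
    total = 0.0
    coeff = 1.0  # (-1)^s (2s)! / (2^(2s) (s!)^2), starting at s = 0
    for s in range(terms):
        total += coeff * (a / r) ** (2 * s) * legendre(2 * s, x)
        coeff *= -(2 * s + 1) / (2 * s + 2)  # ratio of successive coefficients
    return k * q * total / r

if __name__ == "__main__":
    a, z = 1.0, 2.0
    print(ring_potential(z, 0.0, a))          # series on the axis
    print(1.0 / math.sqrt(z * z + a * a))     # Coulomb's law, Eq. (11.62c)
```

Off the axis the same function gives the potential for any $\theta$, which is the point of matching coefficients along the $z$-axis.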


EXERCISES 

11.3.1 You have constructed a set of orthogonal functions by the
Gram-Schmidt process (Section 9.3), taking $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, in
increasing order with $w(x) = 1$ and an interval $-1 \le x \le 1$. Prove
that the $n$th such function constructed is proportional to $P_n(x)$.

Hint. Use mathematical induction.

11.3.2 Expand the Dirac delta function in a series of Legendre polynomials
using the interval $-1 \le x \le 1$.

11.3.3 Verify the Dirac delta function expansions

$$
\delta(1 - x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}\,P_n(x),
$$

$$
\delta(1 + x) = \sum_{n=0}^{\infty}(-1)^n\,\frac{2n + 1}{2}\,P_n(x).
$$

These expressions appear in a resolution of the Rayleigh plane wave
expansion (Exercise 11.4.7) into incoming and outgoing spherical
waves.

Note. Assume that the entire Dirac delta function is covered when
integrating over $[-1, 1]$.

11.3.4 Neutrons (mass 1) are being scattered by a nucleus of mass $A$ ($A > 1$).
In the center-of-mass system the scattering is isotropic. Then, in
the lab system the average of the cosine of the angle of deflection of
the neutron is



Show, by expansion of the denominator, that $\langle\cos\psi\rangle = 2/(3A)$.

11.3.5 A particular function $f(x)$ defined over the interval $[-1, 1]$ is expanded
in a Legendre series over this same interval. Show that the expansion
is unique.

11.3.6 A function $f(x)$ is expanded in a Legendre series
$f(x) = \sum_{n=0}^{\infty} a_nP_n(x)$. Show that

$$
\int_{-1}^{1}\left[f(x)\right]^2dx = \sum_{n=0}^{\infty}\frac{2a_n^2}{2n + 1}.
$$

This is the Legendre form of the Fourier series Parseval identity
(Exercise 14.4.2). It also illustrates Bessel's inequality [Eq. (9.73)]
becoming an equality for a complete set.

11.3.7 Derive the recurrence relation

$$
(1 - x^2)P_n'(x) = nP_{n-1}(x) - nxP_n(x)
$$

from the Legendre polynomial generating function.

11.3.8 Evaluate $\int_0^1 P_n(x)\,dx$.

ANS. $n = 2s$; 1 for $s = 0$, 0 for $s > 0$,

$n = 2s + 1$; $P_{2s}(0)/(2s + 2) = (-1)^s(2s - 1)!!/(2s + 2)!!$

Hint. Use a recurrence relation to replace $P_n(x)$ by derivatives and
then integrate by inspection. Alternatively, you can integrate the
generating function.

11.3.9 (a) For

$$
f(x) = \begin{cases} +1, & 0 < x < 1 \\ -1, & -1 < x < 0, \end{cases}
$$

show that



(b) By testing the series, prove that the series is convergent.
11.3.10 Prove that

$$
\int_{-1}^{1} x(1 - x^2)P_n'P_m'\,dx =
\begin{cases}
0, & \text{unless } m = n \pm 1, \\[4pt]
\dfrac{2n(n^2 - 1)}{4n^2 - 1}\,\delta_{m,n-1}, & \text{if } m < n, \\[8pt]
\dfrac{2n(n + 2)(n + 1)}{(2n + 1)(2n + 3)}\,\delta_{m,n+1}, & \text{if } m > n.
\end{cases}
$$

11.3.11 The amplitude of a scattered wave is given by

$$
f(\theta) = \frac{1}{k}\sum_{l=0}^{\infty}(2l + 1)\exp[i\delta_l]\sin\delta_l\,P_l(\cos\theta),
$$

where $\theta$ is the angle of scattering, $l$ is the angular momentum, $\hbar k$ is
the incident momentum, and $\delta_l$ is the phase shift produced by the
central potential that is doing the scattering. The total cross section
is $\sigma_{\rm tot} = \int|f(\theta)|^2\,d\Omega$. Show that

$$
\sigma_{\rm tot} = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l + 1)\sin^2\delta_l.
$$

11.3.12 The coincidence counting rate, $W(\theta)$, in a gamma-gamma angular
correlation experiment has the form

$$
W(\theta) = \sum_{n=0}^{\infty} a_{2n}P_{2n}(\cos\theta).
$$

Show that data in the range $\pi/2 \le \theta \le \pi$ can, in principle, define the
function $W(\theta)$ (and permit a determination of the coefficients $a_{2n}$).
This means that although data in the range $0 \le \theta < \pi/2$ may be useful
as a check, they are not essential.

11.3.13 A conducting sphere of radius $r_0$ is placed in an initially uniform
electric field, $\mathbf{E}_0$. Show the following:

(a) The induced surface charge density is

$$
\sigma = 3\varepsilon_0E_0\cos\theta.
$$

(b) The induced electric dipole moment is

$$
P = 4\pi r_0^3\varepsilon_0E_0.
$$

The induced electric dipole moment can be calculated either from
the surface charge [part (a)] or by noting that the final electric
field $\mathbf{E}$ is the result of superimposing a dipole field on the original
uniform field.

11.3.14 A charge $q$ is displaced a distance $a$ along the $z$-axis from the center
of a spherical cavity of radius $R$.

(a) Show that the electric field averaged over the volume $a \le r \le R$
is zero.

(b) Show that the electric field averaged over the volume $0 \le r \le a$ is

$$
\mathbf{E} = \hat{\mathbf{z}}E_z = -\hat{\mathbf{z}}\,\frac{q}{4\pi\varepsilon_0a^2} \ \text{(SI units)} = -\hat{\mathbf{z}}\,\frac{nqa}{3\varepsilon_0},
$$

where $n$ is the number of such displaced charges per unit volume. This
is a basic calculation in the polarization of a dielectric.

Hint. $\mathbf{E} = -\nabla\varphi$.

11.3.15 Determine the electrostatic potential (Legendre expansion) of a 
circular ring of electric charge for r < a. 

11.3.16 Calculate the electric field produced by the charged conducting ring 
of Exercise 11.3.15 for 

(a) r > a, (b) r < a. 


11.3.17 Find the potential i//(r, 0) produced by a charged conducting disk 
(Fig. 11.10) for r > a, the radius of the disk. The charge density a 
(on each side of the disk) is 


ct(p) = 


47ra(a 2 — p 2 ) 1 / 2 ’ 


p 2 = x 2 + y 1 . 


Hint. The definite integral you get can be evaluated as a beta function. 
For more details, see Section 5.03 of Smythe in Additional Reading. 

a 00 i / a \ 21 


Figure 11.10 

Charged, Conducting 
Disk 



11.3.18 From the result of Exercise 11.3.17 calculate the potential of the disk. 
Since you are violating the condition r > a, justify your calculation 
carefully. 

Hint. You may run into the hypogeometric series given in Exercise 
5.2.9. 

11.3.19 The hemisphere defined by r = a, 0 < 0 < tt/ 2 has an electrostatic po¬ 
tential +Vq. The hemisphere r = a, n/2 < 0 < 7t has an electrostatic 
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potential — Vo- Show that the potential at interior points is 


V = 


V 0 £ 


71=0 


4n+ 3 
2n + 2 


/ r\ 2,1+1 

( - J P2n(0)P2n+l (COS 0) 


= Vb£(-1)’ 


n=0 


t (4re + 3)(2 tc — 1)!! 
(2n + 2)!! 


/ r x2«+i 

(-J P‘2n+ 1 (COS 0 ). 


Hint. You need Exercise 11.3.8. 

11.3.20 A conducting sphere of radius $a$ is divided into two electrically
separate hemispheres by a thin insulating barrier at its equator. The top
hemisphere is maintained at a potential $V_0$ and the bottom hemisphere
at $-V_0$.

(a) Show that the electrostatic potential exterior to the two hemispheres is

\[ V(r, \theta) = V_0 \sum_{s=0}^{\infty} (-1)^s (4s+3) \frac{(2s-1)!!}{(2s+2)!!} \left(\frac{a}{r}\right)^{2s+2} P_{2s+1}(\cos\theta). \]

(b) Calculate the electric charge density $\sigma$ on the outside surface. Note
that your series diverges at $\cos\theta = \pm 1$ as you expect from the
infinite capacitance of this system (zero thickness for the insulating
barrier).

\[ \text{ANS.} \quad \sigma = \varepsilon_0 E_n = -\varepsilon_0 \left.\frac{\partial V}{\partial r}\right|_{r=a} = \frac{\varepsilon_0 V_0}{a} \sum_{s=0}^{\infty} (-1)^s (4s+3) \frac{(2s-1)!!}{(2s)!!}\, P_{2s+1}(\cos\theta). \]

11.3.21 In the notation of Section 9.4, $\langle x|\varphi_s\rangle = \sqrt{(2s+1)/2}\,P_s(x)$, a Legendre
polynomial is renormalized to unity. Explain how $|\varphi_s\rangle\langle\varphi_s|$ acts as a
projection operator. In particular, show that if $|f\rangle = \sum_n a'_n |\varphi_n\rangle$, then

\[ |\varphi_s\rangle\langle\varphi_s|f\rangle = a'_s |\varphi_s\rangle. \]

11.3.22 Expand $x^8$ as a Legendre series. Determine the Legendre coefficients
from Eq. (11.50),

\[ a_m = \frac{2m+1}{2} \int_{-1}^{1} x^8 P_m(x)\, dx. \]


Check your values against AMS-55, Table 22.9. This illustrates the
expansion of a simple function. Actually, if $f(x)$ is expressed as a power
series, the recursion Eq. (11.26) is both faster and more accurate.
Hint. Gaussian quadrature can be used to evaluate the integral. 
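
As a cross-check on hand or quadrature results for Exercise 11.3.22, the coefficients can also be computed exactly. The sketch below is only one possible approach: it swaps the Gaussian quadrature of the hint for exact rational integration of monomials, builds the $P_m(x)$ from the standard three-term recursion, and uses the hypothetical helper names `legendre_polys` and `legendre_coeffs_of_power`.

```python
from fractions import Fraction

def legendre_polys(n_max):
    """Coefficient lists (lowest power first) of P_0 .. P_n_max, built from
    the recursion (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [Fraction(0)] * (n + 2)
        for k, c in enumerate(cur):       # (2n+1)/(n+1) * x * P_n
            nxt[k + 1] += Fraction(2 * n + 1, n + 1) * c
        for k, c in enumerate(prev):      # - n/(n+1) * P_{n-1}
            nxt[k] -= Fraction(n, n + 1) * c
        polys.append(nxt)
    return polys[: n_max + 1]

def legendre_coeffs_of_power(p):
    """a_m = (2m+1)/2 * integral_{-1}^{1} x^p P_m(x) dx for m = 0 .. p."""
    coeffs = []
    for m, poly in enumerate(legendre_polys(p)):
        integral = Fraction(0)
        for k, c in enumerate(poly):
            if (p + k) % 2 == 0:          # odd powers integrate to zero
                integral += c * Fraction(2, p + k + 1)
        coeffs.append(Fraction(2 * m + 1, 2) * integral)
    return coeffs

a = legendre_coeffs_of_power(8)
print([str(c) for c in a])
```

Because $P_m(1) = 1$, the coefficients must sum to $1^8 = 1$, which gives a quick sanity check independent of any table.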
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11.3.23 Expand arcsine in Legendre polynomials. 

11.3.24 Expand the polynomials $2 + 5x$, $1 + x + x^3$ in a Legendre series and plot
your results and the polynomials for the larger interval $-2 \le x \le 2$.


11.4 Alternate Definitions of Legendre Polynomials 


Rodrigues’s Formula 


The series form of the Legendre polynomials [Eq. (11.18)] of Section 11.1 may
be transformed as follows. From Eq. (11.18)

\[ P_n(x) = \sum_{r=0}^{[n/2]} (-1)^r \frac{(2n-2r)!}{2^n\, r!\,(n-r)!\,(n-2r)!}\, x^{n-2r}. \tag{11.63} \]

For $n$ an integer

\[ P_n(x) = \sum_{r=0}^{[n/2]} (-1)^r \frac{1}{2^n\, r!\,(n-r)!} \left(\frac{d}{dx}\right)^n x^{2n-2r}
 = \frac{1}{2^n n!} \left(\frac{d}{dx}\right)^n \sum_{r=0}^{n} \frac{(-1)^r n!}{r!\,(n-r)!}\, x^{2n-2r}. \tag{11.64} \]

Note the extension of the upper limit. The reader is asked to show in
Exercise 11.4.1 that the additional terms $[n/2] + 1$ to $n$ in the summation contribute
nothing. However, the effect of these extra terms is to permit the replacement
of the new summation by $(x^2 - 1)^n$ (binomial theorem once again) to obtain

\[ P_n(x) = \frac{1}{2^n n!} \left(\frac{d}{dx}\right)^n (x^2 - 1)^n. \tag{11.65} \]

This is Rodrigues’s formula. It is useful in proving many of the properties of
the Legendre polynomials, such as orthogonality. A related application is seen
in Exercise 11.4.3. The Rodrigues definition can be extended to define the
associated Legendre functions.


EXAMPLE 11.4.1 


Lowest Legendre Polynomials For $n = 0$, $P_0 = 1$ follows right away from
Eq. (11.65), as well as $P_1(x) = \frac{2x}{2} = x$. For $n = 2$ we obtain

\[ P_2(x) = \frac{1}{8}\frac{d^2}{dx^2}(x^4 - 2x^2 + 1) = \frac{1}{8}(12x^2 - 4) = \frac{3}{2}x^2 - \frac{1}{2}, \]

and for $n = 3$

\[ P_3(x) = \frac{1}{48}\frac{d^3}{dx^3}(x^6 - 3x^4 + 3x^2 - 1) = \frac{1}{48}(120x^3 - 72x) = \frac{5}{2}x^3 - \frac{3}{2}x, \]


in agreement with Table 11.1. 
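
The same check can be automated for higher $n$. The following is a minimal sketch, not a reference implementation: it expands $(x^2 - 1)^n$ with the binomial theorem, differentiates $n$ times as in Eq. (11.65), and compares the result with the standard three-term recursion. The function names `rodrigues` and `bonnet` are illustrative choices.

```python
from fractions import Fraction
from math import comb, factorial

def rodrigues(n):
    """P_n from Eq. (11.65); coefficients are listed lowest power first."""
    # Binomial expansion of (x^2 - 1)^n.
    poly = [Fraction(0)] * (2 * n + 1)
    for r in range(n + 1):
        poly[2 * n - 2 * r] = Fraction((-1) ** r * comb(n, r))
    # Differentiate n times: d/dx of c x^k is k c x^(k-1).
    for _ in range(n):
        poly = [k * c for k, c in enumerate(poly)][1:]
    scale = Fraction(1, 2 ** n * factorial(n))
    return [scale * c for c in poly]

def bonnet(n):
    """P_n from the recursion (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(cur):
            nxt[i + 1] += Fraction(2 * k + 1, k + 1) * c
        for i, c in enumerate(prev):
            nxt[i] -= Fraction(k, k + 1) * c
        prev, cur = cur, nxt
    return cur

for n in range(7):
    print(n, [str(c) for c in rodrigues(n)])
```

Exact rational arithmetic is used so that agreement between the two constructions is an identity rather than a floating-point coincidence.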
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EXERCISES 


11.4.1 Show that each term in the summation

\[ \sum_{r=[n/2]+1}^{n} \left(\frac{d}{dx}\right)^n \frac{(-1)^r n!}{r!\,(n-r)!}\, x^{2n-2r} \]

vanishes ($r$ and $n$ integral).

11.4.2 Using Rodrigues’s formula, show that the $P_n(x)$ are orthogonal and that

\[ \int_{-1}^{1} [P_n(x)]^2\, dx = \frac{2}{2n+1}. \]

Hint. Use Rodrigues’s formula and integrate by parts.

11.4.3 Show that $\int_{-1}^{1} x^m P_n(x)\, dx = 0$ when $m < n$.

Hint. Use Rodrigues’s formula or expand $x^m$ in Legendre polynomials.


11.4.4 Show that

\[ \int_{-1}^{1} x^n P_n(x)\, dx = \frac{2^{n+1}\, n!\, n!}{(2n+1)!}. \]

Note. You are expected to use Rodrigues’s formula and integrate by
parts, but also see if you can get the result from Eq. (11.18) by inspection.


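11.4.5 Show that

\[ \int_{-1}^{1} x^{2r} P_{2n}(x)\, dx = \frac{2^{2n+1}\,(2r)!\,(r+n)!}{(2r+2n+1)!\,(r-n)!}, \qquad r \ge n. \]

11.4.6 As a generalization of Exercises 11.4.4 and 11.4.5, show that the
Legendre expansions of $x^s$ are

\[ \text{(a)} \quad x^{2r} = \sum_{n=0}^{r} \frac{2^{2n}(4n+1)\,(2r)!\,(r+n)!}{(2r+2n+1)!\,(r-n)!}\, P_{2n}(x), \qquad s = 2r, \]

\[ \text{(b)} \quad x^{2r+1} = \sum_{n=0}^{r} \frac{2^{2n+1}(4n+3)\,(2r+1)!\,(r+n+1)!}{(2r+2n+3)!\,(r-n)!}\, P_{2n+1}(x), \qquad s = 2r+1. \]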


11.4.7 A plane wave may be expanded in a series of spherical waves by the
Rayleigh equation

\[ e^{ikr\cos\gamma} = \sum_{n=0}^{\infty} a_n\, j_n(kr)\, P_n(\cos\gamma). \]

Show that $a_n = i^n(2n+1)$.

Hint.

1. Use the orthogonality of the $P_n$ to solve for $a_n j_n(kr)$.

2. Differentiate $n$ times with respect to $(kr)$ and set $r = 0$ to eliminate
the $r$ dependence.

3. Evaluate the remaining integral by Exercise 11.4.4.

Note. This problem may also be treated by noting that both sides of
the equation satisfy the Helmholtz equation. The equality can be
established by showing that the solutions have the same behavior at the
origin and also behave alike at large distances.

11.4.8 Verify the Rayleigh equation of Exercise 11.4.7 by starting with the
following steps:

(a) Differentiate with respect to $(kr)$ to establish

\[ \sum_n a_n\, j_n'(kr)\, P_n(\cos\gamma) = i \sum_n a_n\, j_n(kr) \cos\gamma\, P_n(\cos\gamma). \]

(b) Use a recurrence relation to replace $\cos\gamma\, P_n(\cos\gamma)$ by a linear
combination of $P_{n-1}$ and $P_{n+1}$.

(c) Use a recurrence relation to replace $j_n'$ by a linear combination of
$j_{n-1}$ and $j_{n+1}$. See Chapter 12 for Bessel functions.

11.4.9 In numerical work (Gauss-Legendre quadrature) it is useful to establish
that $P_n(x)$ has $n$ real zeros in the interior of $[-1, 1]$. Show that this is
so.

Hint. Rolle’s theorem shows that the first derivative of $(x^2 - 1)^{2n}$ has
one zero in the interior of $[-1, 1]$. Extend this argument to the second,
third, and ultimately to the $n$th derivative.


11.5 Associated Legendre Functions 


When Laplace’s equation is separated in spherical polar coordinates
(Section 8.9), one of the separated ODEs is the associated Legendre equation

\[ \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_n^m(\cos\theta)}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right] P_n^m(\cos\theta) = 0. \tag{11.66} \]

With $x = \cos\theta$, this becomes

\[ (1 - x^2)\frac{d^2}{dx^2}P_n^m(x) - 2x\frac{d}{dx}P_n^m(x) + \left[n(n+1) - \frac{m^2}{1 - x^2}\right] P_n^m(x) = 0. \tag{11.67} \]

If the azimuthal separation constant $m^2 = 0$ we have Legendre’s equation,
Eq. (11.36). The regular solutions (with $m$ not necessarily zero), relabeled
$P_n^m(x)$, are

\[ P_n^m(x) = (1 - x^2)^{m/2}\,\frac{d^m}{dx^m}P_n(x). \tag{11.68} \]

These are the associated Legendre functions.¹⁴ Since the highest power of $x$
in $P_n(x)$ is $x^n$, we must have $m \le n$ (or the $m$-fold differentiation will drive
our function to zero). In quantum mechanics the requirement that $m \le n$ has
the physical interpretation that the expectation value of the square of the
$z$-component of the angular momentum is less than or equal to the expectation
value of the square of the angular momentum vector $\mathbf{L}$ (Section 4.3),

\[ \langle L_z^2\rangle \le \langle \mathbf{L}^2\rangle \equiv \int \psi_{nm}^*\, \mathbf{L}^2\, \psi_{nm}\, d^3r, \]

where $m$ is the eigenvalue of $L_z$, and $n(n+1)$ is the eigenvalue of $\mathbf{L}^2$. From the
form of Eq. (11.68), we might expect $m$ to be nonnegative. However, if $P_n(x)$
is expressed by Rodrigues’s formula, this limitation on $m$ is relaxed and we
may have $-n \le m \le n$, with negative as well as positive values of $m$ being
permitted. Using Leibniz’s differentiation formula once again, the reader may
show that $P_n^m(x)$ and $P_n^{-m}(x)$ are related by

\[ P_n^{-m}(x) = (-1)^m \frac{(n-m)!}{(n+m)!}\, P_n^m(x). \tag{11.69} \]

From our definition of the associated Legendre functions, $P_n^m(x)$,

\[ P_n^0(x) = P_n(x). \tag{11.70} \]

¹⁴ One finds (as in AMS-55) the associated Legendre functions defined with an additional factor of
$(-1)^m$. This phase $(-1)^m$ seems an unnecessary complication at this point. It will be included in
the definition of the spherical harmonics $Y_n^m(\theta, \varphi)$. Note also that the upper index $m$ is not an
exponent.


As with the Legendre polynomials, a generating function for the associated
Legendre functions is obtained via Eq. (11.67) from that of ordinary Legendre
polynomials:

\[ \frac{(2m)!\,(1 - x^2)^{m/2}}{2^m m!\,(1 - 2tx + t^2)^{m+1/2}} = \sum_{s=0}^{\infty} P_{s+m}^m(x)\, t^s. \tag{11.71} \]

If we drop the factor $(1 - x^2)^{m/2} = \sin^m\theta$ from this formula and define the
polynomials $\mathcal{P}_{s+m}^m(x) = P_{s+m}^m(x)(1 - x^2)^{-m/2}$, then we obtain a practical form
of the generating function

\[ g_m(x, t) \equiv \frac{(2m)!}{2^m m!\,(1 - 2tx + t^2)^{m+1/2}} = \sum_{s=0}^{\infty} \mathcal{P}_{s+m}^m(x)\, t^s. \tag{11.72} \]

We can derive a recursion relation for associated Legendre polynomials
that is analogous to Eqs. (11.23) and (11.26) by differentiation as follows:

\[ (1 - 2tx + t^2)\frac{\partial g_m}{\partial t} = (2m+1)(x - t)\, g_m(x, t). \]

Substituting the defining expansions for associated Legendre polynomials we
get

\[ (1 - 2tx + t^2) \sum_s s\, \mathcal{P}_{s+m}^m(x)\, t^{s-1} = (2m+1) \sum_s \left[x\, \mathcal{P}_{s+m}^m\, t^s - \mathcal{P}_{s+m}^m\, t^{s+1}\right]. \]

Comparing coefficients of powers of $t$ in these power series, we obtain the
recurrence relation

\[ (s+1)\,\mathcal{P}_{s+m+1}^m - (2m + 1 + 2s)\, x\, \mathcal{P}_{s+m}^m + (s + 2m)\,\mathcal{P}_{s+m-1}^m = 0. \tag{11.73} \]

For $m = 0$ and $s = n$ this relation is Eq. (11.26).
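
The recurrence (11.73) is easy to iterate once a starting value is fixed. The sketch below assumes the start $\mathcal{P}_m^m = (2m)!/(2^m m!)$, which is the $t^0$ coefficient of Eq. (11.72), together with $\mathcal{P}_{m-1}^m = 0$; the function name `script_p` and the coefficient-list representation are illustrative choices.

```python
from fractions import Fraction
from math import factorial

def script_p(m, s_max):
    """Return the polynomials script-P^m_{m}, ..., script-P^m_{m+s_max} of
    Eq. (11.72) as coefficient lists (lowest power first), using Eq. (11.73):
    (s+1) P_{s+m+1} = (2m+1+2s) x P_{s+m} - (s+2m) P_{s+m-1}."""
    start = Fraction(factorial(2 * m), 2 ** m * factorial(m))  # t^0 term of g_m
    polys = [[start]]
    prev = [Fraction(0)]            # script-P^m_{m-1} does not occur, so it is 0
    for s in range(s_max):
        cur = polys[-1]
        nxt = [Fraction(0)] * (len(cur) + 1)
        for k, c in enumerate(cur):
            nxt[k + 1] += Fraction(2 * m + 1 + 2 * s, s + 1) * c
        for k, c in enumerate(prev):
            nxt[k] -= Fraction(s + 2 * m, s + 1) * c
        prev = cur
        polys.append(nxt)
    return polys

for m in (1, 2):
    for s, poly in enumerate(script_p(m, 2)):
        print(f"m={m}, n={s + m}:", [str(c) for c in poly])
```

Multiplying each polynomial by $(1 - x^2)^{m/2}$ gives the associated Legendre function $P_{s+m}^m(x)$ itself.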








Before we can use this relation we need to initialize it, that is, relate the
associated Legendre polynomials with $m = 1$ to ordinary Legendre
polynomials. We observe that

\[ (1 - 2xt + t^2)\, g_1(x, t) = (1 - 2xt + t^2)^{-1/2} = \sum_s P_s(x)\, t^s \tag{11.74} \]

so that upon inserting Eq. (11.72) we get the recursion

\[ \mathcal{P}_{s+1}^1 - 2x\,\mathcal{P}_s^1 + \mathcal{P}_{s-1}^1 = P_s(x). \tag{11.75} \]

More generally, we also have the identity

\[ (1 - 2xt + t^2)\, g_{m+1}(x, t) = (2m+1)\, g_m(x, t), \tag{11.76} \]

from which we extract the recursion

\[ \mathcal{P}_{s+m+1}^{m+1} - 2x\,\mathcal{P}_{s+m}^{m+1} + \mathcal{P}_{s+m-1}^{m+1} = (2m+1)\,\mathcal{P}_{s+m}^m(x), \tag{11.77} \]

which relates the associated Legendre polynomials with superindex $m + 1$ to
those with $m$. For $m = 0$ we recover the initial recursion Eq. (11.75).


EXAMPLE 11.5.1 


Lowest Associated Legendre Polynomials Now we are ready to derive
the entries of Table 11.2. For $m = 1$ and $s = 0$ Eq. (11.75) yields $\mathcal{P}_1^1 = 1$
because $\mathcal{P}_0^1 = 0 = \mathcal{P}_{-1}^1$ do not occur in the definition, Eq. (11.72), of the
associated Legendre polynomials. Multiplying by $(1 - x^2)^{1/2} = \sin\theta$ we get the
first line of Table 11.2. For $s = 1$ we find from Eq. (11.75),

\[ \mathcal{P}_2^1(x) = P_1 + 2x\,\mathcal{P}_1^1 = x + 2x = 3x, \]

from which the second line of Table 11.2, $3\cos\theta\sin\theta$, follows upon
multiplying by $\sin\theta$. For $s = 2$ we get

\[ \mathcal{P}_3^1(x) = P_2 + 2x\,\mathcal{P}_2^1 - \mathcal{P}_1^1 = \frac{1}{2}(3x^2 - 1) + 6x^2 - 1 = \frac{15}{2}x^2 - \frac{3}{2}, \]

in agreement with line 4 of Table 11.2. To get line 3 we use Eq. (11.76). For
$m = 1$, $s = 0$, this gives $\mathcal{P}_2^2(x) = 3\mathcal{P}_1^1(x) = 3$, and multiplying by $1 - x^2 = \sin^2\theta$
reproduces line 3 of Table 11.2. For lines 5, 8, and 9, Eq. (11.72) may be used,
which we leave as an exercise. ■

Table 11.2 Associated Legendre Functions

$P_1^1(x) = (1 - x^2)^{1/2} = \sin\theta$
$P_2^1(x) = 3x(1 - x^2)^{1/2} = 3\cos\theta\sin\theta$
$P_2^2(x) = 3(1 - x^2) = 3\sin^2\theta$
$P_3^1(x) = \frac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2} = \frac{3}{2}(5\cos^2\theta - 1)\sin\theta$
$P_3^2(x) = 15x(1 - x^2) = 15\cos\theta\sin^2\theta$
$P_3^3(x) = 15(1 - x^2)^{3/2} = 15\sin^3\theta$
$P_4^1(x) = \frac{5}{2}(7x^3 - 3x)(1 - x^2)^{1/2} = \frac{5}{2}(7\cos^3\theta - 3\cos\theta)\sin\theta$
$P_4^2(x) = \frac{15}{2}(7x^2 - 1)(1 - x^2) = \frac{15}{2}(7\cos^2\theta - 1)\sin^2\theta$
$P_4^3(x) = 105x(1 - x^2)^{3/2} = 105\cos\theta\sin^3\theta$
$P_4^4(x) = 105(1 - x^2)^2 = 105\sin^4\theta$


EXAMPLE 11.5.2 


Special Values For $x = 1$ we use

\[ (1 - 2t + t^2)^{-m-1/2} = (1 - t)^{-2m-1} = \sum_{s=0}^{\infty} \binom{-2m-1}{s} (-t)^s \]

in Eq. (11.72) and find

\[ \mathcal{P}_{s+m}^m(1) = \frac{(2m)!}{2^m m!}\,(-1)^s \binom{-2m-1}{s}. \tag{11.78} \]

For $m = 1$, $s = 0$ we have $\mathcal{P}_1^1(1) = \binom{-3}{0} = 1$; for $s = 1$, $\mathcal{P}_2^1(1) = -\binom{-3}{1} = 3$;
and for $s = 2$, $\mathcal{P}_3^1(1) = \binom{-3}{2} = \frac{(-3)(-4)}{2} = 6 = \frac{3}{2}(5 - 1)$. These all agree with
Table 11.2.

For $x = 0$ we can also use the binomial expansion, which we leave as an
exercise. ■


EXAMPLE 11.5.3 


Parity From the identity $g_m(-x, -t) = g_m(x, t)$ we obtain the parity relation

\[ \mathcal{P}_{s+m}^m(-x) = (-1)^s\, \mathcal{P}_{s+m}^m(x). \tag{11.79} \]

■

We have the orthogonality integral

\[ \int_{-1}^{1} P_p^m(x)\, P_q^m(x)\, dx = \frac{2}{2q+1}\cdot\frac{(q+m)!}{(q-m)!}\,\delta_{pq} \tag{11.80} \]

or, in spherical polar coordinates,

\[ \int_0^{\pi} P_p^m(\cos\theta)\, P_q^m(\cos\theta) \sin\theta\, d\theta = \frac{2}{2q+1}\cdot\frac{(q+m)!}{(q-m)!}\,\delta_{pq}. \tag{11.81} \]

The orthogonality of the Legendre polynomials is a special case of this
result, obtained by setting $m$ equal to zero; that is, for $m = 0$, Eq. (11.80) reduces
to Eqs. (11.47) and (11.48). In both Eqs. (11.80) and (11.81) our Sturm-Liouville
theory of Chapter 9 could provide the Kronecker delta. A special calculation
is required for the normalization constant.


Spherical Harmonics 


The functions $\Phi_m(\varphi) = e^{im\varphi}$ are orthogonal when integrated over the azimuthal
angle $\varphi$, whereas the functions $P_n^m(\cos\theta)$ are orthogonal upon integrating over
the polar angle $\theta$. We take the product of the two and define

\[ Y_n^m(\theta, \varphi) = (-1)^m \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\; P_n^m(\cos\theta)\, e^{im\varphi} \tag{11.82} \]

to obtain functions of two angles (and two indices) that are orthonormal over
the spherical surface. These $Y_n^m(\theta, \varphi)$ are spherical harmonics. The complete
orthogonality integral becomes

\[ \int_{\varphi=0}^{2\pi}\int_{\theta=0}^{\pi} Y_{n_1}^{m_1 *}(\theta, \varphi)\, Y_{n_2}^{m_2}(\theta, \varphi) \sin\theta\, d\theta\, d\varphi = \delta_{n_1 n_2}\,\delta_{m_1 m_2} \tag{11.83} \]

and explains the presence of the complicated normalization constant in
Eq. (11.82).

Table 11.3 Spherical Harmonics (Condon-Shortley Phase)

The extra $(-1)^m$ included in the defining equation of $Y_n^m(\theta, \varphi)$ with $-n \le
m \le n$ deserves some comment. It is clearly legitimate since Eq. (11.68) is
linear and homogeneous. It is not necessary, but in moving on to certain
quantum mechanical calculations, particularly in the quantum theory of angular
momentum, it is most convenient. The factor $(-1)^m$ is a phase factor often
called the Condon-Shortley phase after the authors of a classic text on atomic
spectroscopy. The effect of this $(-1)^m$ [Eq. (11.82)] and the $(-1)^m$ of Eq. (11.69)
for $P_n^{-m}(\cos\theta)$ is to introduce an alternation of sign among the positive $m$
spherical harmonics. This is shown in Table 11.3.

The functions $Y_n^m(\theta, \varphi)$ acquired the name spherical harmonics because
they are defined over the surface of a sphere with $\theta$ the polar angle and $\varphi$ the
azimuth. The “harmonic” was included because solutions of Laplace’s equation
were called harmonic functions and $Y_n^m(\theta, \varphi)$ is the angular part of such a
solution.
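
A quick numerical check of the orthonormality integral, Eq. (11.83), can be made for the lowest harmonics. The sketch below is illustrative only: it hard-codes $Y_0^0$ and $Y_1^m$ as they follow from Eq. (11.82) with the Condon-Shortley phase, and it uses a plain midpoint rule with assumed grid sizes (`n_theta`, `n_phi`) rather than any particular quadrature scheme from the text.

```python
import cmath
import math

def Y(n, m, theta, phi):
    """Spherical harmonics Y_n^m for n <= 1, from Eq. (11.82)."""
    if (n, m) == (0, 0):
        return complex(1 / math.sqrt(4 * math.pi))
    if (n, m) == (1, 0):
        return complex(math.sqrt(3 / (4 * math.pi)) * math.cos(theta))
    if n == 1 and abs(m) == 1:
        sign = -1 if m == 1 else 1          # Condon-Shortley phase
        return (sign * math.sqrt(3 / (8 * math.pi)) * math.sin(theta)
                * cmath.exp(1j * m * phi))
    raise ValueError("only n <= 1 is implemented in this sketch")

def overlap(n1, m1, n2, m2, n_theta=200, n_phi=64):
    """Midpoint-rule estimate of the integral in Eq. (11.83)."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += (Y(n1, m1, theta, phi).conjugate()
                      * Y(n2, m2, theta, phi) * math.sin(theta))
    return total * d_theta * d_phi

def p_shell_density(theta, phi):
    """Sum of |Y_1^m|^2 over m = -1, 0, 1 at one point on the sphere."""
    return sum(abs(Y(1, m, theta, phi)) ** 2 for m in (-1, 0, 1))

print(abs(overlap(1, 1, 1, 1)), abs(overlap(1, 1, 1, -1)))
print(p_shell_density(0.3, 1.1), 3 / (4 * math.pi))
```

The last line illustrates that the sum of $|Y_1^m|^2$ over $m$ does not depend on the angles, since $\sin^2\theta + \cos^2\theta = 1$.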


EXAMPLE 11.5.4 


Lowest Spherical Harmonics For $n = 0$ we have $m = 0$ and $Y_0^0 = 1/\sqrt{4\pi}$
from Eq. (11.82). For $n = 1$ we have $m = \pm 1, 0$ and $Y_1^0 = \sqrt{3/4\pi}\,\cos\theta$, whereas
for $m = \pm 1$ we see from Table 11.2 that $\cos\theta$ is replaced by $\sin\theta$ and we have
the additional factor $(\mp 1)\, e^{\pm i\varphi}/\sqrt{2}$, which checks with Table 11.3. ■

In the framework of quantum mechanics Eq. (11.67) becomes an orbital
angular momentum equation and the solution $Y_L^M(\theta, \varphi)$ ($n$ replaced by $L$ and
$m$ by $M$) is an angular momentum eigenfunction, with $L$ being the angular
momentum quantum number and $M$ the $z$-axis projection of $L$. These relationships
are developed in more detail in Section 4.3.


EXAMPLE 11.5.5


Spherical Symmetry of Probability Density of Atomic States What is
the angular dependence of the probability density of the degenerate atomic
states with principal quantum number $n = 2$?

Here we have to sum the absolute square of the wave functions for $n = 2$
and orbital angular momentum $l = 0$, $m = 0$; $l = 1$, $m = -1, 0, +1$; that is,
one $s$ and three $p$ states. We ignore the radial dependence. The $s$ state has orbital
angular momentum $l = 0$ and $m = 0$ and is independent of angles. For the $p$
states with $l = 1$ and $m = \pm 1, 0$

\[ Y_1^0 = \sqrt{\frac{3}{4\pi}}\,\cos\theta, \qquad Y_1^{\pm 1} = \mp\sqrt{\frac{3}{8\pi}}\,\sin\theta\, e^{\pm i\varphi}, \]

we have to evaluate the sum

\[ \sum_{m=-1}^{1} |Y_1^m|^2 = \frac{3}{4\pi}\left[2\left(\frac{1}{\sqrt{2}}\sin\theta\right)^2 + \cos^2\theta\right] = \frac{3}{4\pi} \]

upon substituting the spherical harmonics from Table 11.3. This result is
spherically symmetric, as is the density for the $s$ state alone or the sum of
the three $p$ states. These results can be generalized to higher orbital angular
momentum $l$. ■


SUMMARY


Legendre polynomials are naturally defined by their generating function in a
multipole expansion of the Coulomb potential. They appear in physical
systems with azimuthal symmetry. They also arise in the separation of partial
differential equations with spherical or cylindrical symmetry or as orthogonal
eigenfunctions of the Sturm-Liouville theory of their second-order
differential equation. Associated Legendre polynomials appear as ingredients of the
spherical harmonics in situations that lack azimuthal symmetry.


EXERCISES 

11.5.1 Show that the parity of $Y_L^M(\theta, \varphi)$ is $(-1)^L$. Note the disappearance of
any $M$ dependence.

Hint. For the parity operation in spherical polar coordinates, see
Section 2.5 and Section 11.2.

11.5.2 Prove that

\[ Y_L^M(0, \varphi) = \left(\frac{2L+1}{4\pi}\right)^{1/2} \delta_{M0}. \]

11.5.3 In the theory of Coulomb excitation of nuclei we encounter
$Y_L^M(\pi/2, 0)$. Show that

\[ Y_L^M\!\left(\frac{\pi}{2}, 0\right) = \left(\frac{2L+1}{4\pi}\right)^{1/2} \frac{[(L-M)!\,(L+M)!]^{1/2}}{(L-M)!!\,(L+M)!!}\,(-1)^{(L+M)/2} \quad \text{for } L + M \text{ even}, \]

\[ \phantom{Y_L^M\!\left(\frac{\pi}{2}, 0\right)} = 0 \quad \text{for } L + M \text{ odd}. \]

Here,

\[ (2n)!! = 2n(2n-2)\cdots 6\cdot 4\cdot 2, \]

\[ (2n+1)!! = (2n+1)(2n-1)\cdots 5\cdot 3\cdot 1. \]

11.5.4 (a) Express the elements of the quadrupole moment tensor $x_i x_j$ as a
linear combination of the spherical harmonics $Y_2^m$ (and $Y_0^0$).

Note. The tensor $x_i x_j$ is reducible. The $Y_0^0$ indicates the presence
of a scalar component.

(b) The quadrupole moment tensor is usually defined as

\[ Q_{ij} = \int (3 x_i x_j - r^2\delta_{ij})\,\rho(\mathbf{r})\, d\tau, \]

with $\rho(\mathbf{r})$ the charge density. Express the components of
$(3 x_i x_j - r^2\delta_{ij})$ in terms of $r^2 Y_2^M$.

(c) What is the significance of the $-r^2\delta_{ij}$ term?

Hint. Contract the indices $i$, $j$.

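11.5.5 The orthogonal azimuthal functions yield a useful representation of
the Dirac delta function. Show that

\[ \delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} \exp[im(\varphi_1 - \varphi_2)]. \]

11.5.6 Derive the spherical harmonic closure relation

\[ \sum_{l=0}^{\infty}\sum_{m=-l}^{+l} Y_l^m(\theta_1, \varphi_1)\, Y_l^{m*}(\theta_2, \varphi_2) = \frac{1}{\sin\theta_1}\,\delta(\theta_1 - \theta_2)\,\delta(\varphi_1 - \varphi_2) \]

\[ \phantom{\sum\sum} = \delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2). \]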

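11.5.7 The quantum mechanical angular momentum operators $L_x \pm iL_y$ are
given by

\[ L_x + iL_y = e^{i\varphi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right), \]

\[ L_x - iL_y = -e^{-i\varphi}\left(\frac{\partial}{\partial\theta} - i\cot\theta\,\frac{\partial}{\partial\varphi}\right). \]

Show that

(a) $(L_x + iL_y)\, Y_L^M(\theta, \varphi) = \sqrt{(L-M)(L+M+1)}\; Y_L^{M+1}(\theta, \varphi)$,

(b) $(L_x - iL_y)\, Y_L^M(\theta, \varphi) = \sqrt{(L+M)(L-M+1)}\; Y_L^{M-1}(\theta, \varphi)$.

11.5.8 With $L_\pm$ given by

\[ L_\pm = L_x \pm iL_y = \pm e^{\pm i\varphi}\left[\frac{\partial}{\partial\theta} \pm i\cot\theta\,\frac{\partial}{\partial\varphi}\right], \]

show that

\[ \text{(a)} \quad Y_l^m = \sqrt{\frac{(l+m)!}{(2l)!\,(l-m)!}}\;(L_-)^{l-m}\, Y_l^l, \]

\[ \text{(b)} \quad Y_l^m = \sqrt{\frac{(l-m)!}{(2l)!\,(l+m)!}}\;(L_+)^{l+m}\, Y_l^{-l}. \]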

11.5.9 In some circumstances it is desirable to replace the imaginary
exponential of our spherical harmonic by sine or cosine. Morse and
Feshbach define

\[ Y_{mn}^e = P_n^m(\cos\theta)\cos m\varphi, \]

\[ Y_{mn}^o = P_n^m(\cos\theta)\sin m\varphi, \]

where

\[ \int_0^{2\pi}\!\!\int_0^{\pi} \left[Y_{mn}^{e\,\text{or}\,o}(\theta, \varphi)\right]^2 \sin\theta\, d\theta\, d\varphi = \frac{4\pi}{2(2n+1)}\,\frac{(n+m)!}{(n-m)!}, \qquad n = 1, 2, \ldots, \]

\[ \phantom{\int\int} = 4\pi \quad \text{for } n = 0 \ (Y_{00}^o \text{ is undefined}). \]

These spherical harmonics are often named according to the patterns
of their positive and negative regions on the surface of a sphere: zonal
harmonics for $m = 0$, sectoral harmonics for $m = n$, and tesseral
harmonics for $0 < m < n$. For $Y_{mn}^e$, $n = 4$, $m = 0, 2, 4$, indicate on a
diagram of a hemisphere (one diagram for each spherical harmonic)
the regions in which the spherical harmonic is positive.

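11.5.10 A function $f(r, \theta, \varphi)$ may be expressed as a Laplace series

\[ f(r, \theta, \varphi) = \sum_{l,m} a_{lm}\, r^l\, Y_l^m(\theta, \varphi). \]

With $\langle\ \rangle_{\text{sphere}}$ used to mean the average over a sphere (centered on the
origin), show that

\[ \langle f(r, \theta, \varphi)\rangle_{\text{sphere}} = f(0, 0, 0). \]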


Additional Reading 


Hobson, E. W. (1955). The Theory of Spherical and Ellipsoidal Harmonics. 
Chelsea, New York. This is a very complete reference, which is the classic 
text on Legendre polynomials and all related functions. 

Smythe, W. R. (1989). Static and Dynamic Electricity, 3rd ed. McGraw-Hill,
New York. 

See also the references listed in Section 4.4 and at the end of Chapter 13. 












Chapter 12 



Bessel Functions 


12.1 Bessel Functions of the First Kind, $J_\nu(x)$


Bessel functions appear in a wide variety of physical problems. When one
analyzes the sound vibrations of a drum, the partial differential wave equation
(PDE) is solved in cylindrical coordinates. By separating the radial and
angular variables, $R(r)e^{in\varphi}$, one is led to the Bessel ordinary differential equation
(ODE) for $R(r)$ involving the integer $n$ as a parameter (see Example 12.1.4).
The Wentzel-Kramers-Brillouin (WKB) approximation in quantum mechanics
involves Bessel functions. A spherically symmetric square well potential in
quantum mechanics is solved by spherical Bessel functions. Also, the
extraction of phase shifts from atomic and nuclear scattering data requires spherical
Bessel functions. In Sections 8.5 and 8.6 series solutions to Bessel’s equation
were developed. In Section 8.9 we have seen that the Laplace equation in
cylindrical coordinates also leads to a form of Bessel’s equation. Bessel functions
also appear in integral form (integral representations). This may result from
integral transforms (Chapter 15).

Bessel functions and closely related functions form a rich area of mathe¬ 
matical analysis with many representations, many interesting and useful prop¬ 
erties, and many interrelations. Some of the major interrelations are developed 
in Section 12.1 and in succeeding sections. Note that Bessel functions are not 
restricted to Chapter 12. The asymptotic forms are developed in Section 7.3 as 
well as in Section 12.3, and the series solutions are discussed in Sections 8.5 
and 8.6. 
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Biographical Data 

Bessel, Friedrich Wilhelm. Bessel, a German astronomer, was born in
Minden, Prussia, in 1784 and died in Königsberg, Prussia (now Russia) in
1846. At the age of 20, he recalculated the orbit of Halley’s comet, impressing
the well-known astronomer Olbers sufficiently to support him in 1806 for a
post at an observatory. There he developed the functions named after him
in refinements of astronomical calculations. The first parallax measurement
of a star (61 Cygni, about 6 light-years away from Earth), which he made in 1838,
proved definitively that Earth was moving in accord with Copernican theory.
His calculations of irregularities in the orbit of Uranus paved the way for
the later discovery of Neptune by Leverrier and J. C. Adams, a triumph of
Newton’s theory of gravity.


Generating Function for Integral Order 


Although Bessel functions $J_\nu(x)$ are of interest primarily as solutions of
Bessel’s differential equation, Eq. (8.62),

\[ x^2\frac{d^2 y}{dx^2} + x\frac{dy}{dx} + (x^2 - \nu^2)\, y = 0, \]

it is instructive and convenient to develop them from a generating function, just
as for Legendre polynomials in Chapter 11.¹ This approach has the advantages
of finding recurrence relations, special values, and normalization integrals and
focusing on the functions themselves rather than on the differential equation
they satisfy. Since there is no physical application that provides the generating
function in closed form, such as the electrostatic potential for Legendre
polynomials in Chapter 11, we have to find it from a suitable differential equation.

¹ Generating functions were also used in Chapter 5. In Section 5.6, the generating function $(1 + x)^n$
defines the binomial coefficients; $x/(e^x - 1)$ generates the Bernoulli numbers in the same sense.

We therefore start by deriving from Bessel’s series [Eq. (8.70)] for integer
index $\nu = n$,

\[ J_n(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s+n)!}\left(\frac{x}{2}\right)^{2s+n}, \]

which converges absolutely for all $x$, the recursion relation

\[ \frac{d}{dx}\left[x^{-n} J_n(x)\right] = -x^{-n} J_{n+1}(x). \tag{12.1} \]

This can also be written as

\[ \frac{n}{x} J_n(x) - J_{n+1}(x) = J_n'(x). \tag{12.2} \]

To show this, we replace the summation index $s \to s - 1$ in the Bessel function
series [Eq. (8.70)] for $J_{n+1}(x)$,

\[ J_{n+1}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s+n+1)!}\left(\frac{x}{2}\right)^{2s+n+1}, \tag{12.3} \]

in order to change the denominator $(s+n+1)!$ to $(s+n)!$. Thus, we obtain the
series

\[ J_{n+1}(x) = -\frac{1}{x}\sum_{s=0}^{\infty} \frac{(-1)^s\, 2s}{s!\,(s+n)!}\left(\frac{x}{2}\right)^{n+2s}, \tag{12.4} \]

which is almost the series for $J_n(x)$, except for the factor $s$. If we divide by $x^n$
and differentiate, this factor $s$ is produced so that we get from Eq. (12.4)

\[ x^{-n} J_{n+1}(x) = -\sum_{s} \frac{(-1)^s\, 2s}{s!\,(s+n)!}\,\frac{x^{2s-1}}{2^{n+2s}} = -\frac{d}{dx}\left[x^{-n} J_n(x)\right], \tag{12.5} \]

that is, Eq. (12.1).

A similar argument for $J_{n-1}$, with summation index $s$ replaced first by $s - n$
and then by $s \to s + 1$, yields

\[ \frac{d}{dx}\left[x^n J_n(x)\right] = x^n J_{n-1}(x), \tag{12.6} \]

which can be written as

\[ J_{n-1}(x) - \frac{n}{x} J_n(x) = J_n'(x). \tag{12.7} \]

Eliminating $J_n'$ from Eqs. (12.2) and (12.7), we obtain the recurrence

\[ J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x} J_n(x), \tag{12.8} \]

which we substitute into the generating series

\[ g(x, t) = \sum_{n=-\infty}^{\infty} J_n(x)\, t^n. \tag{12.9} \]

This gives the ODE in $t$ (with $x$ a parameter)

\[ \sum_{n=-\infty}^{\infty} t^n \left(J_{n-1}(x) + J_{n+1}(x)\right) = \left(t + \frac{1}{t}\right) g(x, t) = \frac{2t}{x}\frac{\partial g}{\partial t}. \]

Writing it as

\[ \frac{1}{g}\frac{\partial g}{\partial t} = \frac{x}{2}\left(1 + \frac{1}{t^2}\right), \tag{12.10} \]

and integrating we get

\[ \ln g = \frac{x}{2}\left(t - \frac{1}{t}\right) + \ln c, \tag{12.11} \]

which, when exponentiated, leads to

\[ g(x, t) = e^{(x/2)(t - 1/t)}\, c(x), \tag{12.12} \]

where $c$ is the integration constant that may depend on the parameter $x$. Now
taking $x = 0$ and using $J_n(0) = \delta_{n0}$ (from Example 12.1.1) in Eq. (12.9)
gives $g(0, t) = 1$ and $c(0) = 1$. To determine $c(x)$ for all $x$, we expand the
exponential in Eq. (12.12). The $1/t$ term leads to a Laurent series (see
Section 6.5). Incidentally, we understand why the summation in Eq. (12.9) has to
run over negative integers as well. So we have a product of Maclaurin series
in $xt/2$ and $-x/2t$,

\[ e^{xt/2}\cdot e^{-x/2t} = \sum_{r=0}^{\infty}\left(\frac{x}{2}\right)^r \frac{t^r}{r!}\; \sum_{s=0}^{\infty} (-1)^s \left(\frac{x}{2}\right)^s \frac{t^{-s}}{s!}. \tag{12.13} \]

For a given $s$ we get $t^n$ ($n \ge 0$) from $r = n + s$:

\[ \left(\frac{x}{2}\right)^{n+s} \frac{t^{n+s}}{(n+s)!}\,(-1)^s \left(\frac{x}{2}\right)^s \frac{t^{-s}}{s!}. \tag{12.14} \]

The coefficient of $t^n$ is then²

\[ J_n(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(n+s)!}\left(\frac{x}{2}\right)^{n+2s} = \frac{x^n}{2^n n!} - \frac{x^{n+2}}{2^{n+2}(n+1)!} + \cdots, \tag{12.15} \]

so that $c(x) \equiv 1$ for all $x$ in Eq. (12.12) by comparing the coefficient of $t^0$, $J_0(x)$,
with Eq. (12.3) for $n = -1$. Thus, the generating function is

\[ g(x, t) = e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\, t^n. \tag{12.16} \]

Figure 12.1 Bessel Functions $J_0(x)$, $J_1(x)$, and $J_2(x)$

n=—oo 


This series form, Eq. (12.15), exhibits the behavior of the Bessel function $J_n(x)$
for all $x$, converging everywhere, and permits numerical evaluation of $J_n(x)$.
The results for $J_0$, $J_1$, and $J_2$ are shown in Fig. 12.1. From Section 5.3, the error
in using only a finite number of terms in numerical evaluation is less than the
first term omitted. For instance, if we want $J_n(x)$ to ±1% accuracy, the first
term alone of Eq. (12.15) will suffice, provided the ratio of the second term
to the first is less than 1% (in magnitude) or $x < 0.2\,(n+1)^{1/2}$. The Bessel
functions oscillate but are not periodic; however, in the limit as $x \to \infty$ the
zeros become equidistant (Section 12.3). The amplitude of $J_n(x)$ is not constant
but decreases asymptotically as $x^{-1/2}$. [See Eq. (12.106) for this envelope.]

² From the steps leading to this series and from its absolute convergence properties it should be
clear that this series converges absolutely, with $x$ replaced by $z$ and with $z$ any point in the finite
complex plane.
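
The 1% rule of thumb can be tried out directly. The sketch below sums the series (12.15) term by term and compares it with the leading term $x^n/(2^n n!)$ just inside the bound $x < 0.2\,(n+1)^{1/2}$. The function names and the fixed cutoff of 40 terms are assumptions of this illustration, and the plain series loses accuracy for large $x$ because of cancellation.

```python
import math

def bessel_j(n, x, terms=40):
    """Partial sum of Eq. (12.15) for integer n >= 0 and moderate |x|."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.factorial(n + s))
                  * (x / 2) ** (n + 2 * s))
    return total

def leading_term(n, x):
    """First term of Eq. (12.15): x^n / (2^n n!)."""
    return (x / 2) ** n / math.factorial(n)

for n in range(4):
    x = 0.19 * math.sqrt(n + 1)      # just inside x < 0.2 (n+1)^(1/2)
    full = bessel_j(n, x)
    rel_err = abs(leading_term(n, x) - full) / abs(full)
    print(f"n={n}  x={x:.4f}  relative error of first term = {rel_err:.5f}")
```

The ratio of the second term to the first is $x^2/[4(n+1)]$, so the bound on $x$ is simply the condition that this ratio stay below 0.01.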
Equation (12.15) actually holds for $n < 0$, also giving

\[ J_{-n}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s-n)!}\left(\frac{x}{2}\right)^{2s-n}, \tag{12.17} \]

which amounts to replacing $n$ by $-n$ in Eq. (12.15). Since $n$ is an integer (here),
$(s-n)! \to \infty$ for $s = 0, \ldots, (n-1)$. Hence, the series may be considered to
start with $s = n$. Replacing $s$ by $s + n$, we obtain

\[ J_{-n}(x) = \sum_{s=0}^{\infty} \frac{(-1)^{s+n}}{s!\,(s+n)!}\left(\frac{x}{2}\right)^{n+2s}, \]

showing that $J_n(x)$ and $J_{-n}(x)$ are not independent but are related by

\[ J_{-n}(x) = (-1)^n J_n(x) \quad \text{(integral } n\text{)}. \tag{12.18} \]

These series expressions [Eqs. (12.15) and (12.17)] may be used with $n$ replaced
by $\nu$ to define $J_\nu(x)$ and $J_{-\nu}(x)$ for nonintegral $\nu$ (compare Exercise 12.1.11).

EXAMPLE 12.1.1 


Special Values Setting $x = 0$ in Eq. (12.12), using the series [Eq. (12.9)] yields

\[ 1 = \sum_{n=-\infty}^{\infty} J_n(0)\, t^n, \]

from which we infer (uniqueness of Laurent expansion)

\[ J_0(0) = 1, \qquad J_n(0) = 0 = J_{-n}(0), \quad n \ge 1. \]

From $t = 1$ we find the identity

\[ 1 = \sum_{n=-\infty}^{\infty} J_n(x) = J_0(x) + 2\sum_{n=1}^{\infty} J_{2n}(x) \]

using the symmetry relation [Eq. (12.18)].

Finally, the identity $g(-x, t) = g(x, -t)$ implies

\[ \sum_{n=-\infty}^{\infty} J_n(-x)\, t^n = \sum_{n=-\infty}^{\infty} J_n(x)\,(-t)^n \]

and again the parity relations $J_n(-x) = (-1)^n J_n(x)$. These results can also be
extracted from the identity $g(-x, 1/t) = g(x, t)$. ■
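
These special values lend themselves to a quick numerical spot check. The sketch below is illustrative: it evaluates $J_n(x)$ from the series (12.15) with an assumed cutoff of 40 terms, truncates the sum over even orders at $n = 40$, and the sample point $x = 3$ is arbitrary.

```python
import math

def bessel_j(n, x, terms=40):
    """Partial sum of Eq. (12.15) for integer n >= 0 and moderate |x|."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.factorial(n + s))
                  * (x / 2) ** (n + 2 * s))
    return total

x = 3.0

# Identity from t = 1:  1 = J_0(x) + 2 * sum_{n>=1} J_{2n}(x).
unit_sum = bessel_j(0, x) + 2 * sum(bessel_j(2 * k, x) for k in range(1, 21))

# Parity: J_n(-x) = (-1)^n J_n(x).
parity_gaps = [abs(bessel_j(n, -x) - (-1) ** n * bessel_j(n, x))
               for n in range(6)]

print(unit_sum, max(parity_gaps))
```

Truncating the even-order sum is harmless here because $J_n(x)$ falls off very rapidly once $n$ exceeds $x$.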


Applications of Recurrence Relations 


We have already derived the basic recurrence relations Eqs. (12.1), (12.2), 
(12.6), and (12.7) that led us to the generating function. Many more can be 
derived as follows. 
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EXAMPLE 12.1.2 


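Addition Theorem The linearity of the generating function in the exponent
$x$ suggests the identity

\[ g(u + v, t) = e^{[(u+v)/2](t - 1/t)} = e^{(u/2)(t - 1/t)}\, e^{(v/2)(t - 1/t)} = g(u, t)\, g(v, t), \]

which implies the Bessel expansions

\[ \sum_{n=-\infty}^{\infty} J_n(u + v)\, t^n = \sum_{l=-\infty}^{\infty} J_l(u)\, t^l \sum_{k=-\infty}^{\infty} J_k(v)\, t^k = \sum_{k,l=-\infty}^{\infty} J_l(u)\, J_k(v)\, t^{l+k} \]

\[ \phantom{\sum} = \sum_{m=-\infty}^{\infty} t^m \sum_{l=-\infty}^{\infty} J_l(u)\, J_{m-l}(v), \]

denoting $m = k + l$. Comparing coefficients yields the addition theorem

\[ J_m(u + v) = \sum_{l=-\infty}^{\infty} J_l(u)\, J_{m-l}(v). \tag{12.19} \]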
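
A numerical spot check of the addition theorem is straightforward. In the sketch below the infinite sum over $l$ is truncated at $|l| \le 25$, negative orders are obtained from Eq. (12.18), and the sample values of $u$, $v$, and $m$ are arbitrary; all of these choices, like the helper names, are assumptions of the illustration.

```python
import math

def bessel_j(n, x, terms=40):
    """Partial sum of Eq. (12.15) for integer n >= 0 and moderate |x|."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.factorial(n + s))
                  * (x / 2) ** (n + 2 * s))
    return total

def bessel_j_int(n, x):
    """Integer order of either sign, using J_{-n} = (-1)^n J_n, Eq. (12.18)."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x)
    return bessel_j(n, x)

def addition_rhs(m, u, v, l_max=25):
    """Truncated right-hand side of Eq. (12.19)."""
    return sum(bessel_j_int(l, u) * bessel_j_int(m - l, v)
               for l in range(-l_max, l_max + 1))

u, v, m = 1.2, 0.7, 2
lhs = bessel_j_int(m, u + v)
rhs = addition_rhs(m, u, v)
print(lhs, rhs)
```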


Differentiating Eq. (12.16) partially with respect to $x$, we have

\[ \frac{\partial}{\partial x} g(x, t) = \frac{1}{2}\left(t - \frac{1}{t}\right) e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J_n'(x)\, t^n. \tag{12.20} \]

Again, substituting in Eq. (12.16) and equating the coefficients of like powers
of $t$, we obtain

\[ J_{n-1}(x) - J_{n+1}(x) = 2 J_n'(x), \tag{12.21} \]

which can also be obtained by adding Eqs. (12.2) and (12.7). As a special case
of this recurrence relation,

\[ J_0'(x) = -J_1(x). \tag{12.22} \]


Bessel’s Differential Equation 


Suppose we consider a set of functions $Z_\nu(x)$ that satisfies the basic recurrence
relations [Eqs. (12.8) and (12.21)], but with $\nu$ not necessarily an integer and
$Z_\nu$ not necessarily given by the series [Eq. (12.15)]. Equation (12.7) may be
rewritten ($n \to \nu$) as

\[ x Z_\nu'(x) = x Z_{\nu-1}(x) - \nu Z_\nu(x). \tag{12.23} \]

On differentiating with respect to $x$, we have

\[ x Z_\nu''(x) + (\nu + 1) Z_\nu' - x Z_{\nu-1}' - Z_{\nu-1} = 0. \tag{12.24} \]

Multiplying by $x$ and then subtracting Eq. (12.23) multiplied by $\nu$ gives us

\[ x^2 Z_\nu'' + x Z_\nu' - \nu^2 Z_\nu + (\nu - 1)\, x Z_{\nu-1} - x^2 Z_{\nu-1}' = 0. \tag{12.25} \]

Now we rewrite Eq. (12.2) and replace $n$ by $\nu - 1$:

\[ x Z_{\nu-1}' = (\nu - 1) Z_{\nu-1} - x Z_\nu. \tag{12.26} \]

Using Eq. (12.26) to eliminate $Z_{\nu-1}$ and $Z_{\nu-1}'$ from Eq. (12.25), we finally get

\[ x^2 Z_\nu'' + x Z_\nu' + (x^2 - \nu^2) Z_\nu = 0. \tag{12.27} \]

This is Bessel’s ODE. Hence, any functions, $Z_\nu(x)$, that satisfy the recurrence
relations [Eqs. (12.2) and (12.7), (12.8) and (12.21), or (12.1) and (12.6)] satisfy
Bessel’s equation; that is, the unknown $Z_\nu$ are Bessel functions. In particular,
we have shown that the functions $J_n(x)$, defined by our generating function,
satisfy Bessel’s ODE. Under the parity transformation, $x \to -x$, Bessel’s ODE
stays invariant, thereby relating $Z_\nu(-x)$ to $Z_\nu(x)$, up to a phase factor. If the
argument is $k\rho$ rather than $x$, which is the case in many physics problems, then
Eq. (12.27) becomes

\[ \rho^2\frac{d^2}{d\rho^2} Z_\nu(k\rho) + \rho\frac{d}{d\rho} Z_\nu(k\rho) + (k^2\rho^2 - \nu^2)\, Z_\nu(k\rho) = 0. \tag{12.28} \]
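
The chain from recurrence relations to Bessel's ODE can be checked numerically for integer order. The sketch below takes $J_n' = (J_{n-1} - J_{n+1})/2$ from Eq. (12.21), applies it twice to get $J_n'' = (J_{n-2} - 2J_n + J_{n+2})/4$, and evaluates the left-hand side of Eq. (12.27). The helper names and sample points are assumptions of this illustration.

```python
import math

def bessel_j(n, x, terms=40):
    """Partial sum of Eq. (12.15) for integer n >= 0 and moderate |x|."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.factorial(n + s))
                  * (x / 2) ** (n + 2 * s))
    return total

def J(n, x):
    """Integer order of either sign, using J_{-n} = (-1)^n J_n, Eq. (12.18)."""
    return (-1) ** (-n) * bessel_j(-n, x) if n < 0 else bessel_j(n, x)

def dJ(n, x):
    """First derivative from Eq. (12.21)."""
    return 0.5 * (J(n - 1, x) - J(n + 1, x))

def d2J(n, x):
    """Second derivative, obtained by applying Eq. (12.21) twice."""
    return 0.25 * (J(n - 2, x) - 2 * J(n, x) + J(n + 2, x))

def ode_residual(n, x):
    """Left-hand side of Eq. (12.27) evaluated for Z = J_n."""
    return x * x * d2J(n, x) + x * dJ(n, x) + (x * x - n * n) * J(n, x)

for n in range(4):
    print(n, [ode_residual(n, x) for x in (0.5, 2.0, 5.0)])
```

The residuals are zero up to rounding error, and the same helpers confirm Eq. (12.8) and the special case $J_0' = -J_1$ of Eq. (12.22).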


Integral Representations 


A particularly useful and powerful way of treating Bessel functions employs
integral representations. If we return to the generating function [Eq. (12.16)]
and substitute $t = e^{i\theta}$, we get

\[ e^{ix\sin\theta} = J_0(x) + 2[J_2(x)\cos 2\theta + J_4(x)\cos 4\theta + \cdots] + 2i[J_1(x)\sin\theta + J_3(x)\sin 3\theta + \cdots], \tag{12.29} \]

in which we have used the relations

\[ J_1(x)e^{i\theta} + J_{-1}(x)e^{-i\theta} = J_1(x)\left(e^{i\theta} - e^{-i\theta}\right) = 2iJ_1(x)\sin\theta, \tag{12.30} \]

\[ J_2(x)e^{2i\theta} + J_{-2}(x)e^{-2i\theta} = 2J_2(x)\cos 2\theta, \]

and so on. In summation notation, equating real and imaginary parts of
Eq. (12.29), we have

\[ \cos(x\sin\theta) = J_0(x) + 2\sum_{n=1}^{\infty} J_{2n}(x)\cos(2n\theta), \]

\[ \sin(x\sin\theta) = 2\sum_{n=1}^{\infty} J_{2n-1}(x)\sin[(2n-1)\theta]. \tag{12.31} \]

It might be noted that angle $\theta$ (in radians) has no dimensions, just as $x$.
Likewise, $\sin\theta$ has no dimensions and the function $\cos(x\sin\theta)$ is perfectly proper
from a dimensional standpoint.

If $n$ and $m$ are positive integers (zero is excluded),³ we recall the
orthogonality properties of cosine and sine:⁴

\[ \int_0^{\pi} \cos n\theta\,\cos m\theta\, d\theta = \frac{\pi}{2}\,\delta_{nm}, \tag{12.32} \]

\[ \int_0^{\pi} \sin n\theta\,\sin m\theta\, d\theta = \frac{\pi}{2}\,\delta_{nm}. \tag{12.33} \]

Multiplying Eq. (12.31) by cos nO and sin w.C, respectively, and integrating we 
obtain 



1 

r 

•J nix') 

n 

even, 



/ cos(x sin 0) cos n0 d0 = 


odd, 

(12.34) 

JT 

Jo 

o, 

n 

1 

r 

0, 

n 

even, 



/ sin(.xsin 0 ) sin n() df) = 


odd, 

(12.35) 

JT 

Jo 

<40*0, 

n 


upon employing the orthogonality relations Eqs. (12.32) and (12.33). If Eqs. 
(12.34) and (12.35) are added together, we obtain 


$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} [\cos(x\sin\theta)\cos n\theta + \sin(x\sin\theta)\sin n\theta]\, d\theta = \frac{1}{\pi}\int_0^{\pi} \cos(n\theta - x\sin\theta)\, d\theta, \quad n = 0, 1, 2, 3, \ldots \tag{12.36}$$

As a special case,

$$J_0(x) = \frac{1}{\pi}\int_0^{\pi} \cos(x\sin\theta)\, d\theta. \tag{12.37}$$

Noting that $\cos(x\sin\theta)$ repeats itself in all four quadrants ($\theta_1 = \theta$, $\theta_2 = \pi - \theta$,
$\theta_3 = \pi + \theta$, $\theta_4 = -\theta$), $\cos(x\sin\theta_2) = \cos(x\sin\theta)$, etc., we may write
Eq. (12.37) as

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} \cos(x\sin\theta)\, d\theta. \tag{12.38}$$

On the other hand, $\sin(x\sin\theta)$ reverses its sign in the third and fourth
quadrants so that

$$\frac{1}{2\pi}\int_0^{2\pi} \sin(x\sin\theta)\, d\theta = 0. \tag{12.39}$$

Adding Eq. (12.38) and i times Eq. (12.39), we obtain the complex exponential
representation

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\sin\theta}\, d\theta = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\cos\theta}\, d\theta. \tag{12.40}$$


³ Equations (12.32) and (12.33) hold for either m or n = 0. If both m and n = 0, the constant in Eq.
(12.32) becomes $\pi$; the constant in Eq. (12.33) becomes 0.

⁴ They are eigenfunctions of a self-adjoint equation (oscillator ODE of classical mechanics) and
satisfy appropriate boundary conditions (compare Section 9.2).

Figure 12.2 Fraunhofer Diffraction: Circular Aperture

This integral representation [Eq. (12.40)] may be obtained more directly by
employing contour integration.⁵ Many other integral representations exist.
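The integral representation of Eq. (12.36) also gives a simple way to compute $J_n(x)$ numerically. The following is a minimal sketch using the trapezoidal rule; the function name and the number of steps are illustrative assumptions. Because the integrand extends to a smooth periodic function of θ, the trapezoidal rule converges very quickly here.

```python
import math


def bessel_j_integral(n, x, steps=200):
    """J_n(x) for integer n >= 0 from Eq. (12.36), via the trapezoidal rule on [0, pi]."""
    h = math.pi / steps
    # Endpoint values: cos(0) = 1 at theta = 0 and cos(n*pi) at theta = pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        theta = k * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi


print(bessel_j_integral(0, 1.0))     # J_0(1)
print(bessel_j_integral(1, 1.0))     # J_1(1)
print(bessel_j_integral(0, 2.4048))  # close to zero: 2.4048 is the first zero of J_0 (Table 12.1)
```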


EXAMPLE 12.1.3

Fraunhofer Diffraction, Circular Aperture In the theory of diffraction
through a circular aperture we encounter the integral

$$\Phi \sim \int_0^{a}\int_0^{2\pi} e^{ibr\cos\theta}\, d\theta\, r\, dr \tag{12.41}$$

for $\Phi$, the amplitude of the diffracted wave. Here, the parameter b is defined
as

$$b = \frac{2\pi}{\lambda}\sin\alpha, \tag{12.42}$$

where $\lambda$ is the wavelength of the incident wave, $\alpha$ is the angle defined by a
point on a screen below the circular aperture relative to the normal through
the center point,⁶ and $\theta$ is an azimuth angle in the plane of the circular
aperture of radius a. The other symbols are defined in Fig. 12.2. From Eq. (12.40)
we get

$$\Phi \sim 2\pi\int_0^{a} J_0(br)\, r\, dr. \tag{12.43}$$


⁵ For n = 0 a simple integration over $\theta$ from 0 to $2\pi$ will convert Eq. (12.29) into Eq. (12.40).

⁶ The exponent $ibr\cos\theta$ gives the phase of the wave on the distant screen at angle $\alpha$ relative to the
phase of the wave incident on the aperture at the point $(r, \theta)$. The imaginary exponential form of
this integrand means that the integral is technically a Fourier transform (Chapter 15). In general,
the Fraunhofer diffraction pattern is given by the Fourier transform of the aperture.

Table 12.1 Zeros of the Bessel Functions and Their First Derivatives

Number of zero   J_0(x)     J_1(x)     J_2(x)     J_3(x)     J_4(x)     J_5(x)
1                 2.4048     3.8317     5.1356     6.3802     7.5883     8.7715
2                 5.5201     7.0156     8.4172     9.7610    11.0647    12.3386
3                 8.6537    10.1735    11.6198    13.0152    14.3725    15.7002
4                11.7915    13.3237    14.7960    16.2235    17.6160    18.9801
5                14.9309    16.4706    17.9598    19.4094    20.8269    22.2178

Number of zero   J_0'(x)^a  J_1'(x)    J_2'(x)    J_3'(x)
1                 3.8317     1.8412     3.0542     4.2012
2                 7.0156     5.3314     6.7061     8.0152
3                10.1735     8.5363     9.9695    11.3459

^a J_0'(x) = -J_1(x).


Equation (12.6) enables us to integrate Eq. (12.43) immediately to obtain

$$\Phi \sim \frac{2\pi ab}{b^2} J_1(ab) \sim \frac{\lambda a}{\sin\alpha}\, J_1\!\left(\frac{2\pi a}{\lambda}\sin\alpha\right). \tag{12.44}$$

The intensity of the light in the diffraction pattern is proportional to $\Phi^2$ and

$$\Phi^2 \sim \left\{\frac{J_1[(2\pi a/\lambda)\sin\alpha]}{\sin\alpha}\right\}^2. \tag{12.45}$$

From Table 12.1, which lists some zeros of the Bessel functions and their
first derivatives,⁷ Eq. (12.45) will have its smallest zero at

$$\frac{2\pi a}{\lambda}\sin\alpha = 3.8317\ldots \tag{12.46}$$

or

$$\sin\alpha = \frac{3.8317\,\lambda}{2\pi a}. \tag{12.47}$$

For green light $\lambda = 5.5 \times 10^{-7}$ m. Hence, if a = 0.5 cm,

$$\alpha \approx \sin\alpha = 6.7 \times 10^{-5}\ \text{(radian)} \approx 14\ \text{sec of arc}, \tag{12.48}$$

which shows that the bending or spreading of the light ray is extremely small,
because most of the intensity of light is in the principal maximum. If this
analysis had been known in the 17th century, the arguments against the wave theory
of light would have collapsed. In the mid-20th century this same diffraction
pattern appeared in the scattering of nuclear particles by atomic nuclei, a striking
demonstration of the wave properties of the nuclear particles. ■
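The numbers in Eqs. (12.46)-(12.48) can be reproduced with a short calculation. The sketch below is one possible approach, not the method of the text: it evaluates $J_1$ from the integral representation of Eq. (12.36), locates its first positive zero by bisection on a hand-picked bracket [3.0, 4.5], and converts the result to an angle for the wavelength and aperture radius quoted above. All function and variable names are illustrative.

```python
import math


def bessel_j1(x, steps=400):
    """J_1(x) from Eq. (12.36) with n = 1, via the trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(math.pi))  # endpoint values at theta = 0 and pi
    for k in range(1, steps):
        theta = k * h
        total += math.cos(theta - x * math.sin(theta))
    return total * h / math.pi


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


first_zero = bisect(bessel_j1, 3.0, 4.5)  # bracket chosen by inspection of Table 12.1
wavelength = 5.5e-7                       # green light, in meters
radius = 0.5e-2                           # aperture radius a = 0.5 cm, in meters
sin_alpha = first_zero * wavelength / (2 * math.pi * radius)  # Eq. (12.47)
arcsec = math.degrees(math.asin(sin_alpha)) * 3600

print(f"first zero of J_1: {first_zero:.4f}")
print(f"sin(alpha) = {sin_alpha:.2e} rad, about {arcsec:.1f} seconds of arc")
```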


⁷ Additional roots of the Bessel functions and their first derivatives may be found in C. L. Beattie,
Table of first 700 zeros of Bessel functions. Bell Syst. Tech. J. 37, 689 (1958), and Bell Monogr.
3055.

Orthogonality


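If Bessel's equation [Eq. (12.28)] is divided by $\rho$, we see that it becomes
self-adjoint, and therefore by the Sturm-Liouville theory of Section 9.2 the solutions
are expected to be orthogonal, if we can arrange to have appropriate boundary
conditions satisfied. To take care of the boundary conditions, for a finite
interval [0, a], we introduce parameters a and $\alpha_{\nu m}$ into the argument of $J_\nu$ to
get $J_\nu(\alpha_{\nu m}\rho/a)$. Here, a is the upper limit of the cylindrical radial coordinate
$\rho$. From Eq. (12.28),

$$\rho\frac{d^2}{d\rho^2} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu m}^2\rho}{a^2} - \frac{\nu^2}{\rho}\right) J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) = 0. \tag{12.49}$$

Changing the parameter $\alpha_{\nu m}$ to $\alpha_{\nu n}$, we find that $J_\nu(\alpha_{\nu n}\rho/a)$ satisfies

$$\rho\frac{d^2}{d\rho^2} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu n}^2\rho}{a^2} - \frac{\nu^2}{\rho}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right) = 0. \tag{12.50}$$

Proceeding as in Section 9.2, we multiply Eq. (12.49) by $J_\nu(\alpha_{\nu n}\rho/a)$ and Eq.
(12.50) by $J_\nu(\alpha_{\nu m}\rho/a)$ and subtract, obtaining

$$J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right] - J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right] = \frac{\alpha_{\nu n}^2 - \alpha_{\nu m}^2}{a^2}\,\rho\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right). \tag{12.51}$$

Integrating from $\rho = 0$ to $\rho = a$, we obtain

$$\int_0^a J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right] d\rho - \int_0^a J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right] d\rho = \frac{\alpha_{\nu n}^2 - \alpha_{\nu m}^2}{a^2}\int_0^a J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\rho\, d\rho. \tag{12.52}$$

Upon integrating by parts, the left-hand side of Eq. (12.52) becomes

$$\left[\rho\, J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]_0^a - \left[\rho\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho} J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right]_0^a. \tag{12.53}$$

For $\nu \ge 0$ the factor $\rho$ guarantees a zero at the lower limit, $\rho = 0$. Actually, the
lower limit on the index $\nu$ may be extended down to $\nu > -1$.⁸ At $\rho = a$, each
expression vanishes if we choose the parameters $\alpha_{\nu n}$ and $\alpha_{\nu m}$ to be zeros or
roots of $J_\nu$; that is, $J_\nu(\alpha_{\nu m}) = 0$. The subscripts now become meaningful: $\alpha_{\nu m}$
is the mth zero of $J_\nu$.

⁸ The case $\nu = -1$ reverts to $\nu = +1$, Eq. (12.18).

With this choice of parameters, the left-hand side vanishes (the
Sturm-Liouville boundary conditions are satisfied) and for $m \ne n$

$$\int_0^a J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu n}\frac{\rho}{a}\right)\rho\, d\rho = 0. \tag{12.54}$$

This gives us orthogonality over the interval [0, a].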


Normalization 


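The normalization integral may be developed by returning to Eq. (12.53),
setting $\alpha_{\nu n} = \alpha_{\nu m} + \varepsilon$, and taking the limit $\varepsilon \to 0$. With the aid of the recurrence
relation, Eq. (12.2), the result may be written as

$$\int_0^a \left[J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]^2 \rho\, d\rho = \frac{a^2}{2}\,[J_{\nu+1}(\alpha_{\nu m})]^2. \tag{12.55}$$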
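Equations (12.54) and (12.55) are easy to test numerically for $\nu = 0$ and a = 1. The sketch below assumes the standard power series for $J_\nu$ (the series the text calls Eq. (12.15)), uses Simpson's rule for the integrals, and takes the zeros of $J_0$ from Table 12.1 rounded to four decimals, so the overlap integral is only approximately zero. Function names and the grid size are illustrative.

```python
import math


def bessel_j(nu, x, terms=40):
    """Partial sum of the power series for J_nu(x) (assumed form of Eq. 12.15)."""
    return sum(
        (-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1)) * (x / 2) ** (2 * s + nu)
        for s in range(terms)
    )


def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


alpha = [2.4048, 5.5201, 8.6537]  # first zeros of J_0 from Table 12.1

# Eq. (12.54): overlap of two different modes on [0, 1] with weight rho.
overlap = simpson(lambda r: bessel_j(0, alpha[0] * r) * bessel_j(0, alpha[1] * r) * r, 0.0, 1.0)

# Eq. (12.55): normalization integral against (a^2 / 2) [J_1(alpha)]^2 with a = 1.
norm = simpson(lambda r: bessel_j(0, alpha[0] * r) ** 2 * r, 0.0, 1.0)
expected = 0.5 * bessel_j(1, alpha[0]) ** 2

print(f"overlap = {overlap:.2e}")
print(f"norm = {norm:.6f}, (1/2) J_1(alpha)^2 = {expected:.6f}")
```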




Bessel Series 


If we assume that the set of Bessel functions $J_\nu(\alpha_{\nu m}\rho/a)$ ($\nu$ fixed, m = 1, 2,
3, ...) is complete, then any well-behaved, but otherwise arbitrary, function
$f(\rho)$ may be expanded in a Bessel series (Bessel-Fourier or Fourier-Bessel)

$$f(\rho) = \sum_{m=1}^{\infty} c_{\nu m}\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right), \quad 0 \le \rho \le a, \quad \nu > -1. \tag{12.56}$$

The coefficients $c_{\nu m}$ are determined by using Eq. (12.55),

$$c_{\nu m} = \frac{2}{a^2[J_{\nu+1}(\alpha_{\nu m})]^2}\int_0^a f(\rho)\, J_\nu\!\left(\alpha_{\nu m}\frac{\rho}{a}\right)\rho\, d\rho. \tag{12.57}$$
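To make Eqs. (12.56) and (12.57) concrete, the sketch below expands the sample function $f(\rho) = 1 - \rho^2$ on [0, 1] in the functions $J_0(\alpha_{0m}\rho)$. The choice of f, the use of only the five zeros listed in Table 12.1, the power series for $J_\nu$, and all names are assumptions made for illustration.

```python
import math


def bessel_j(nu, x, terms=40):
    """Partial sum of the power series for J_nu(x) (assumed form of Eq. 12.15)."""
    return sum(
        (-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1)) * (x / 2) ** (2 * s + nu)
        for s in range(terms)
    )


def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


ZEROS_J0 = [2.4048, 5.5201, 8.6537, 11.7915, 14.9309]  # Table 12.1


def coefficient(f, alpha, a=1.0):
    """Bessel series coefficient c_{0m} from Eq. (12.57) with nu = 0."""
    integral = simpson(lambda r: f(r) * bessel_j(0, alpha * r / a) * r, 0.0, a)
    return 2.0 * integral / (a ** 2 * bessel_j(1, alpha) ** 2)


def sample_f(r):
    return 1.0 - r * r


coeffs = [coefficient(sample_f, alpha) for alpha in ZEROS_J0]


def partial_sum(r):
    """Five-term truncation of Eq. (12.56) for a = 1."""
    return sum(c * bessel_j(0, alpha * r) for c, alpha in zip(coeffs, ZEROS_J0))


print([round(c, 4) for c in coeffs])
print(partial_sum(0.0), sample_f(0.0))
print(partial_sum(0.5), sample_f(0.5))
```

With only five terms the partial sum already tracks the sample function to within about a percent; adding further zeros would tighten the agreement.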

An application of the use of Bessel functions and their roots is provided 
by drumhead vibrations in Example 12.1.4 and the electromagnetic resonant 
cavity, Example 12.1.5, and the exercises of Section 12.1. 


EXAMPLE 12.1.4 


Vibrations of a Plane Circular Membrane Vibrating membranes are of 
great practical importance because they occur not only in drums but also in 
telephones, microphones, pumps, and other devices. We will show that their 
vibrations are governed by the two-dimensional wave equation and then solve 
this PDE in terms of Bessel functions by separating variables. 

We assume that the membrane is made of elastic material of constant mass
per unit area, $\rho$, without resistance to (slight) bending. It is stretched along
all of its boundary in the xy-plane, generating constant tension T per unit
length in all relevant directions, which does not change while the membrane
vibrates. The deflected membrane surface is denoted by z = z(x, y, t), and
$|z|$, $|\partial z/\partial x|$, $|\partial z/\partial y|$ are small compared to a, the radius of the drumhead,
for all times t. For small transverse (to the membrane surface) vibrations of a
thin elastic membrane these assumptions are valid and lead to an accurate
description of a drumhead.

Figure 12.3 Rectangular Patch of a Vibrating Membrane

To derive the PDE we analyze a small rectangular patch of lengths dx, dy
and area dx dy. The forces on the sides of the patch are T dx and T dy acting
tangentially while the membrane is deflected slightly from its horizontal
equilibrium position (Fig. 12.3). The angles $\alpha$, $\beta$ at opposite ends of the deflected
membrane patch are small so that $\cos\alpha \approx \cos\beta \approx 1$, upon keeping only terms
linear in $\alpha$. Hence, the horizontal force components, $T\cos\alpha$, $T\cos\beta \approx T$, are
practically equal so that horizontal movements are negligible.

The z components of the forces are $T\,dy\,\sin\alpha$ and $-T\,dy\,\sin\beta$ at opposite
sides in the x-direction, up (positive) at x + dx and down (negative) at x. Their
sum is

$$T\,dy\,(\sin\alpha - \sin\beta) \approx T\,dy\,(\tan\alpha - \tan\beta) = T\,dy\left[\frac{\partial z(x + dx, y, t)}{\partial x} - \frac{\partial z(x, y, t)}{\partial x}\right]$$

because $\tan\alpha$, $\tan\beta$ are the slopes of the deflected membrane in the x-direction
at x + dx and x, respectively. Similarly, the sum of the forces at opposite sides
in the other direction is

$$T\,dx\left[\frac{\partial z(x, y + dy, t)}{\partial y} - \frac{\partial z(x, y, t)}{\partial y}\right].$$

According to Newton's force law the sum of these forces is equal to the
mass of the undeflected membrane area, $\rho\,dx\,dy$, times the acceleration; that
is,


$$\rho\,dx\,dy\,\frac{\partial^2 z}{\partial t^2} = T\,dx\left[\frac{\partial z(x, y + dy, t)}{\partial y} - \frac{\partial z(x, y, t)}{\partial y}\right] + T\,dy\left[\frac{\partial z(x + dx, y, t)}{\partial x} - \frac{\partial z(x, y, t)}{\partial x}\right].$$

Dividing by the patch area dx dy we obtain the second-order PDE

$$\frac{\rho}{T}\frac{\partial^2 z}{\partial t^2} = \frac{\dfrac{\partial z(x, y + dy, t)}{\partial y} - \dfrac{\partial z(x, y, t)}{\partial y}}{dy} + \frac{\dfrac{\partial z(x + dx, y, t)}{\partial x} - \dfrac{\partial z(x, y, t)}{\partial x}}{dx}$$

or, with the constant $c^2 = T/\rho$ and in the limit $dx, dy \to 0$,

$$\frac{1}{c^2}\frac{\partial^2 z}{\partial t^2} = \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}, \tag{12.58}$$

which is called the two-dimensional wave equation.

Because there are no mixed derivatives, such as $\partial^2 z/\partial t\,\partial x$, we can separate
the time dependence in a product form of the solution z = v(t)w(x, y).
Substituting this z and its derivatives into our wave equation and dividing by
v(t)w(x, y) yields

$$\frac{1}{c^2 v(t)}\frac{d^2 v}{dt^2} = \frac{1}{w(x, y)}\left(\frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2}\right).$$

Here, the left-hand side depends on the variable t only, whereas the right-hand
side contains the spatial variables only. This implies that both sides must be
equal to a constant, $-k^2$, leading to the harmonic oscillator ODE in t and the
two-dimensional Helmholtz equation in x and y:

$$\frac{d^2 v}{dt^2} + k^2 c^2 v(t) = 0, \qquad \frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + k^2 w(x, y) = 0. \tag{12.59}$$

Further steps will depend on our boundary conditions and their symmetry. We
have chosen the negative sign for the separation constant $-k^2$ because this
sign will correspond to oscillatory solutions,

$$v(t) = A\cos(kct) + B\sin(kct),$$

in time rather than the exponentially damped ones we would get for a positive
separation constant. Note that, in our derivation of the wave equation, we
have tacitly assumed no damping of the membrane (which could lead to a
more complicated PDE). Besides the dynamics, this choice is also dictated by
the boundary conditions that we have in mind and will discuss in more detail
next.

The circular shape of the membrane suggests using cylindrical coordinates
so that $z = z(t, \rho, \varphi)$. Invariance under rotations about the z-axis suggests no
dependence of the deflected membrane on the azimuthal angle $\varphi$. Therefore,
$z = z(t, \rho)$, provided the initial deflection (dent) and its velocity at time t = 0
(needed because our PDE is second order in time)

$$z(0, \rho) = f(\rho), \qquad \frac{\partial z}{\partial t}(0, \rho) = g(\rho)$$

also have no angular dependence. Moreover, the membrane is fixed at the
circular boundary $\rho = a$ at all times, z(t, a) = 0.

In cylindrical coordinates, using Eq. (2.21) for the two-dimensional
Laplacian, the wave equation becomes

$$\frac{1}{c^2}\frac{\partial^2 z}{\partial t^2} = \frac{\partial^2 z}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial z}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 z}{\partial\varphi^2}, \tag{12.60}$$

where we delete the angular dependence, reducing the wave equation to

$$\frac{1}{c^2}\frac{\partial^2 z}{\partial t^2} = \frac{\partial^2 z}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial z}{\partial\rho}.$$

Now we separate variables again seeking a solution of the product form
$z = v(t)w(\rho)$. Substituting this z and its derivatives into our wave equation [Eq.
(12.60)] and dividing by $v(t)w(\rho)$ yields the same harmonic oscillator ODE for
v(t) and

$$\frac{d^2 w}{d\rho^2} + \frac{1}{\rho}\frac{dw}{d\rho} + k^2 w(\rho) = 0$$

instead of Eq. (12.59). Dimensional arguments suggest rescaling $\rho \to r = k\rho$
and dividing by $k^2$, yielding

$$\frac{d^2 w}{dr^2} + \frac{1}{r}\frac{dw}{dr} + w(r) = 0,$$

which is Bessel's ODE for $\nu = 0$.

We adopt the solution $J_0(r)$ because it is finite everywhere, whereas the
second independent solution is singular at the origin, which is ruled out by our
initial and boundary conditions.

The boundary condition w(a) = 0 requires $J_0(ka) = 0$ so that

$$k = k_n = \gamma_n/a, \quad \text{with } J_0(\gamma_n) = 0, \quad n = 1, 2, \ldots.$$

The zeros $\gamma_1 = 2.4048, \ldots$ are listed in Table 12.1. The general solution

$$z(t, \rho) = \sum_{n=1}^{\infty} [A_n\cos(\gamma_n ct/a) + B_n\sin(\gamma_n ct/a)]\, J_0(\gamma_n\rho/a) \tag{12.61}$$

follows from the superposition principle. The initial conditions at t = 0 require
expanding

$$f(\rho) = \sum_{n=1}^{\infty} A_n J_0(\gamma_n\rho/a), \qquad g(\rho) = \sum_{n=1}^{\infty} \frac{\gamma_n c}{a} B_n J_0(\gamma_n\rho/a), \tag{12.62}$$

where the coefficients $A_n$, $B_n$ may be obtained by projection from these Bessel
series expansions of the given functions $f(\rho)$, $g(\rho)$ using orthogonality
properties of the Bessel functions [see Eq. (12.57)] in the general case

$$A_n = \frac{2}{a^2[J_1(\gamma_n)]^2}\int_0^a f(\rho)\, J_0(\gamma_n\rho/a)\,\rho\, d\rho. \tag{12.63}$$

A similar relation holds for the $B_n$ involving $g(\rho)$.

Figure 12.4 Initial Central Bang on Drumhead

To illustrate a simpler case, let us assume initial conditions

$$f(\rho) = -0.01a\, J_0(\gamma_1\rho/a), \qquad g(\rho) = -0.1\,\frac{\gamma_2 c}{a}\, J_0(\gamma_2\rho/a),$$

corresponding to an initial central bang on the drumhead (Fig. 12.4) at t = 0.
Then our complete solution (with $c^2 = T/\rho$)

$$z(t, \rho) = -0.01a\, J_0\!\left(2.4048\frac{\rho}{a}\right)\cos\frac{2.4048\,ct}{a} - 0.1\, J_0\!\left(5.5201\frac{\rho}{a}\right)\sin\frac{5.5201\,ct}{a}$$

contains only the n = 1 and n = 2 terms as dictated by the initial conditions
(Fig. 12.5). ■
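The two-mode solution of this example can be evaluated directly. The sketch below assumes the standard power series for $J_0$ (the series the text calls Eq. (12.15)) and uses illustrative units a = c = 1; the function names are hypothetical. It checks the initial deflection at the center and the fixed boundary at $\rho = a$.

```python
import math


def bessel_j0(x, terms=40):
    """Partial sum of the power series for J_0(x) (assumed form of Eq. 12.15)."""
    return sum((-1) ** s / math.factorial(s) ** 2 * (x / 2) ** (2 * s) for s in range(terms))


GAMMA_1, GAMMA_2 = 2.4048, 5.5201  # first two zeros of J_0 from Table 12.1


def displacement(t, rho, a=1.0, c=1.0):
    """z(t, rho) for the central-bang initial conditions of Example 12.1.4."""
    mode_1 = -0.01 * a * bessel_j0(GAMMA_1 * rho / a) * math.cos(GAMMA_1 * c * t / a)
    mode_2 = -0.1 * bessel_j0(GAMMA_2 * rho / a) * math.sin(GAMMA_2 * c * t / a)
    return mode_1 + mode_2


print(displacement(0.0, 0.0))  # initial deflection at the center: -0.01 a
print(displacement(0.3, 1.0))  # at the rim: zero up to the rounding of the tabulated zeros
for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(rho, displacement(0.2, rho))
```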


EXAMPLE 12.1.5 


Cylindrical Resonant Cavity The propagation of electromagnetic waves in 
hollow metallic cylinders is important in many practical devices. If the cylinder 
has end surfaces, it is called a cavity. Resonant cavities play a crucial role in 
many particle accelerators. 

We take the z-axis along the center of the cavity with end surfaces at z = 0
and z = l and use cylindrical coordinates suggested by the geometry. Its walls
are perfect conductors so that the tangential electric field vanishes on them
(as in Fig. 12.6):


$$E_z = 0 \ \text{ for } \rho = a, \qquad E_\rho = 0 = E_\varphi \ \text{ for } z = 0, l. \tag{12.64}$$

Figure 12.6 Cylindrical Resonant Cavity

Inside the cavity we have a vacuum so that $\varepsilon_0\mu_0 = 1/c^2$. In the interior of a
resonant cavity electromagnetic waves oscillate with harmonic time
dependence $e^{-i\omega t}$, which follows from separating the time from the spatial variables
in Maxwell's equations (Section 1.8) so that

$$\nabla\times\nabla\times\mathbf{E} = -\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = k_0^2\,\mathbf{E}, \qquad k_0^2 = \omega^2/c^2.$$

With $\nabla\cdot\mathbf{E} = 0$ (vacuum, no charges) and Eq. (1.96), we obtain for the
space part of the electric field

$$\nabla^2\mathbf{E} + k_0^2\,\mathbf{E} = 0,$$

which is called the vector Helmholtz PDE. The z component ($E_z$, space part
only) satisfies the scalar Helmholtz equation [three-dimensional generalization
of Eq. (12.59) in Example 12.1.4]

$$\nabla^2 E_z + k_0^2 E_z = 0. \tag{12.65}$$

The transverse electric field components $\mathbf{E}_\perp = (E_\rho, E_\varphi)$ obey the same PDE
but different boundary conditions given previously.

As in Example 12.1.4, we can separate the z variable from $\rho$ and $\varphi$ because
there are no mixed derivatives $\partial^2 E_z/\partial z\,\partial\rho$, etc. The product solution $E_z = v(\rho, \varphi)w(z)$
is substituted into Eq. (12.65) using Eq. (2.21) for $\nabla^2$ in cylindrical coordinates;
then we divide by vw, yielding

$$-\frac{1}{w(z)}\frac{d^2 w}{dz^2} = \frac{1}{v(\rho, \varphi)}\left(\frac{\partial^2 v}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial v}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 v}{\partial\varphi^2} + k_0^2 v\right) = k^2,$$

where $k^2$ is the separation constant because the left- and right-hand sides
depend on different variables. For w(z) we find the harmonic oscillator ODE
with standing wave solution

$$w(z) = A\sin kz + B\cos kz,$$

with A, B constants. For $v(\rho, \varphi)$ we obtain

$$\frac{\partial^2 v}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial v}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 v}{\partial\varphi^2} + \gamma^2 v = 0, \qquad \gamma^2 = k_0^2 - k^2.$$

In this PDE we can separate the $\rho$ and $\varphi$ variables because there is no mixed
term $\partial^2 v/\partial\rho\,\partial\varphi$. The product form $v = u(\rho)\Phi(\varphi)$ yields

$$\frac{\rho^2}{u(\rho)}\left(\frac{d^2 u}{d\rho^2} + \frac{1}{\rho}\frac{du}{d\rho} + \gamma^2 u\right) = -\frac{1}{\Phi(\varphi)}\frac{d^2\Phi}{d\varphi^2} = m^2,$$

where m in the separation constant $m^2$ must be an integer because the angular
solution, $\Phi = e^{im\varphi}$, of the ODE

$$\frac{d^2\Phi}{d\varphi^2} + m^2\Phi = 0$$

must be periodic in the azimuthal angle.
This leaves us with the radial ODE

$$\frac{d^2 u}{d\rho^2} + \frac{1}{\rho}\frac{du}{d\rho} + \left(\gamma^2 - \frac{m^2}{\rho^2}\right)u = 0.$$

Dimensional arguments suggest rescaling $\rho \to r = \gamma\rho$ and dividing by $\gamma^2$,
which yields

$$\frac{d^2 u}{dr^2} + \frac{1}{r}\frac{du}{dr} + \left(1 - \frac{m^2}{r^2}\right)u = 0.$$

This is Bessel's ODE for $\nu = m$.

We use the regular solution $J_m(\gamma\rho)$ because the (irregular) second
independent solution is singular at the origin, which is unacceptable. The complete
solution is

$$E_z = J_m(\gamma\rho)\,e^{im\varphi}(A\sin kz + B\cos kz), \tag{12.66}$$

where the constant $\gamma$ is determined from the boundary condition $E_z = 0$
on the cavity surface $\rho = a$ (i.e., that $\gamma a$ be a root of the Bessel function $J_m$)
(Table 12.1). This gives a discrete set of values $\gamma = \gamma_{mn}$, where n designates
the nth root of $J_m$ (Table 12.1).

For the transverse magnetic (TM) mode of oscillation with $H_z = 0$,
Maxwell's equations imply (see "Resonant Cavities" in J. D. Jackson's
Electrodynamics)

$$\mathbf{E}_\perp \sim \nabla_\perp\frac{\partial E_z}{\partial z}, \qquad \nabla_\perp = \left(\frac{\partial}{\partial\rho}, \frac{1}{\rho}\frac{\partial}{\partial\varphi}\right).$$

The form of this result suggests $E_z \sim \cos kz$, that is, setting A = 0, so that
$\mathbf{E}_\perp \sim \sin kz = 0$ at z = 0, l can be satisfied by

$$k = \frac{p\pi}{l}, \quad p = 0, 1, 2, \ldots. \tag{12.67}$$

Thus, the tangential electric fields $E_\rho$ and $E_\varphi$ vanish at z = 0 and l. In other
words, A = 0 corresponds to $dE_z/dz = 0$ at z = 0 and z = l for the TM mode.
Altogether then, we have

$$\gamma^2 = \frac{\omega^2}{c^2} - k^2 = \frac{\omega^2}{c^2} - \frac{p^2\pi^2}{l^2}, \tag{12.68}$$

with

$$\gamma = \gamma_{mn} = \frac{\alpha_{mn}}{a}, \tag{12.69}$$

where $\alpha_{mn}$ is the nth zero of $J_m$. The general solution

$$E_z = \sum_{m,n,p} J_m(\gamma_{mn}\rho)\,e^{\pm im\varphi} B_{mnp}\cos\frac{p\pi z}{l}, \tag{12.70}$$

with constants $B_{mnp}$, now follows from the superposition principle.

The consequence of the two boundary conditions and the separation
constant $m^2$ is that the angular frequency of our oscillation depends on three
discrete parameters:

$$\omega_{mnp} = c\sqrt{\frac{\alpha_{mn}^2}{a^2} + \frac{p^2\pi^2}{l^2}}, \qquad \begin{cases} m = 0, 1, 2, \ldots \\ n = 1, 2, 3, \ldots \\ p = 0, 1, 2, \ldots \end{cases} \tag{12.71}$$

These are the allowed resonant frequencies for the TM mode. Feynman et al.⁹
develop Bessel functions from cavity resonators. ■
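Equation (12.71) is straightforward to evaluate once the zeros $\alpha_{mn}$ are known. The sketch below uses a few zeros copied from Table 12.1 and a hypothetical cavity of radius 10 cm and length 20 cm; the dictionary of zeros, the function name, and the cavity dimensions are illustrative assumptions.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s

# alpha_mn: nth zero of J_m, copied from Table 12.1 (only a few entries included here).
BESSEL_ZEROS = {(0, 1): 2.4048, (0, 2): 5.5201, (1, 1): 3.8317, (2, 1): 5.1356}


def tm_angular_frequency(m, n, p, radius, length):
    """Resonant angular frequency omega_mnp of a TM mode, Eq. (12.71)."""
    alpha = BESSEL_ZEROS[(m, n)]
    return C_LIGHT * math.sqrt((alpha / radius) ** 2 + (p * math.pi / length) ** 2)


radius, length = 0.10, 0.20  # hypothetical cavity dimensions in meters
for mode in [(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)]:
    freq_hz = tm_angular_frequency(*mode, radius, length) / (2 * math.pi)
    print(f"TM{mode}: {freq_hz / 1e9:.4f} GHz")
```

For p = 0 the frequency depends only on the radius, which is why the lowest TM mode of such a cavity is fixed by the first zero of $J_0$.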


Bessel Functions of Nonintegral Order 


The generating function approach is very convenient for deriving two
recurrence relations, Bessel's differential equation, integral representations,
addition theorems (Example 12.1.2), and upper and lower bounds (Exercise
12.1.1). However, the generating function defines only Bessel functions of
integral order $J_0$, $J_1$, $J_2$, and so on. This is a limitation of the generating
function approach that can be avoided by using a contour integral (Section 12.3)
instead. However, the Bessel function of the first kind, $J_\nu(x)$, may easily be
defined for nonintegral $\nu$ by using the series [Eq. (12.15)] as a new definition.

We have verified the recurrence relations by substituting in the series form
of $J_\nu(x)$. From these relations Bessel's equation follows. In fact, if $\nu$ is not an
integer, there is actually an important simplification. It is found that $J_\nu$ and $J_{-\nu}$
are independent because no relation of the form of Eq. (12.18) exists. On the
other hand, for $\nu = n$, an integer, we need another solution. The development
of this second solution and an investigation of its properties are the subject of
Section 12.2.
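A small sketch of the series definition for nonintegral order follows. It assumes that Eq. (12.15) is the standard power series with the factorial $(s + \nu)!$ written as a gamma function, and it compares the result for $\nu = \pm 1/2$ with the well-known closed forms $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$ and $J_{-1/2}(x) = \sqrt{2/(\pi x)}\,\cos x$, which are standard results quoted here for the check rather than derived in this section.

```python
import math


def bessel_j(nu, x, terms=40):
    """Series definition of J_nu(x) for x > 0 and nu not a negative integer (assumed Eq. 12.15)."""
    return sum(
        (-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1)) * (x / 2) ** (2 * s + nu)
        for s in range(terms)
    )


for x in (0.5, 1.0, 2.0):
    prefactor = math.sqrt(2 / (math.pi * x))
    print(x, bessel_j(0.5, x), prefactor * math.sin(x))
    print(x, bessel_j(-0.5, x), prefactor * math.cos(x))
```

The two functions behave differently as x approaches 0 (one vanishes, the other diverges), which illustrates why $J_\nu$ and $J_{-\nu}$ are independent for nonintegral $\nu$.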


EXERCISES 

12.1.1 From the product of the generating functions $g(x, t)\cdot g(x, -t)$ show
that

$$1 = [J_0(x)]^2 + 2[J_1(x)]^2 + 2[J_2(x)]^2 + \cdots$$

and therefore that $|J_0(x)| \le 1$ and $|J_n(x)| \le 1/\sqrt{2}$, n = 1, 2, 3, ....
Hint. Use uniqueness of power series (Section 5.7).


⁹ Feynman, R. P., Leighton, R. B., and Sands, M. (1964). The Feynman Lectures on Physics, Vol. 2,
Chap. 23. Addison-Wesley, Reading, MA.

12.1.2 Derive the Jacobi-Anger expansion

$$e^{iz\cos\theta} = \sum_{m=-\infty}^{\infty} i^m J_m(z)\,e^{im\theta}.$$

This is an expansion of a plane wave in a series of cylindrical waves.

12.1.3 Show that

(a) $\cos x = J_0(x) + 2\sum_{n=1}^{\infty} (-1)^n J_{2n}(x)$,

(b) $\sin x = 2\sum_{n=1}^{\infty} (-1)^{n+1} J_{2n-1}(x)$.

12.1.4 Prove that

$$\frac{\sin x}{x} = \int_0^{\pi/2} J_0(x\cos\theta)\cos\theta\, d\theta, \qquad \frac{1 - \cos x}{x} = \int_0^{\pi/2} J_1(x\cos\theta)\, d\theta.$$

Hint. The definite integral

$$\int_0^{\pi/2} \cos^{2s+1}\theta\, d\theta = \frac{2\cdot 4\cdot 6\cdots(2s)}{1\cdot 3\cdot 5\cdots(2s + 1)}$$

may be useful.

12.1.5 Show that

$$J_0(x) = \frac{2}{\pi}\int_0^1 \frac{\cos xt}{\sqrt{1 - t^2}}\, dt.$$

This integral is a Fourier cosine transform (compare Section 15.4).
The corresponding Fourier sine transform,

$$J_0(x) = \frac{2}{\pi}\int_1^{\infty} \frac{\sin xt}{\sqrt{t^2 - 1}}\, dt,$$

is established using a Hankel function integral representation.

12.1.6 Derive

$$J_n(x) = (-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n J_0(x).$$

Hint. Rewrite the recursion so that it serves to step from n to n + 1 in
a proof by mathematical induction.

12.1.7 Show that between any two consecutive zeros of $J_n(x)$ there is one
and only one zero of $J_{n+1}(x)$.

Hint. Equations (12.1) and (12.6) may be useful.

12.1.8 An analysis of antenna radiation patterns for a system with a circular
aperture involves the equation

$$g(u) = \int_0^1 f(r)\, J_0(ur)\, r\, dr.$$

If $f(r) = 1 - r^2$, show that

$$g(u) = \frac{2}{u^2} J_2(u).$$

12.1.9 The differential cross section in a nuclear scattering experiment is
given by $d\sigma/d\Omega = |f(\theta)|^2$. An approximate treatment leads to

$$f(\theta) = \frac{-ik}{2\pi}\int_0^{2\pi}\int_0^R \exp[ik\rho\sin\theta\sin\varphi]\,\rho\, d\rho\, d\varphi,$$

where $\theta$ is an angle through which the scattered particle is scattered,
and R is the nuclear radius. Show that

$$\frac{d\sigma}{d\Omega} = (\pi R^2)\,\frac{1}{\pi}\left[\frac{J_1(kR\sin\theta)}{\sin\theta}\right]^2.$$

12.1.10 A set of functions $C_n(x)$ satisfies the recurrence relations

$$C_{n-1}(x) - C_{n+1}(x) = \frac{2n}{x} C_n(x),$$

$$C_{n-1}(x) + C_{n+1}(x) = 2C_n'(x).$$

(a) What linear second-order ODE does the $C_n(x)$ satisfy?

(b) By a change of variable, transform your ODE into Bessel's
equation. This suggests that $C_n(x)$ may be expressed in terms of Bessel
functions of transformed argument.


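12.1.11 (a) From

$$J_\nu(x) = \frac{1}{2\pi i}\left(\frac{x}{2}\right)^{\nu}\int t^{-\nu-1} e^{t - x^2/4t}\, dt$$

with a suitably defined contour, derive the recurrence relation

$$J_\nu'(x) = \frac{\nu}{x} J_\nu(x) - J_{\nu+1}(x).$$

(b) From

$$J_\nu(x) = \frac{1}{2\pi i}\int t^{-\nu-1} e^{(x/2)(t - 1/t)}\, dt$$

with the same contour, derive the recurrence relation

$$J_\nu'(x) = \tfrac{1}{2}[J_{\nu-1}(x) - J_{\nu+1}(x)].$$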
12.1.12 Show that the recurrence relation

$$J_n'(x) = \tfrac{1}{2}[J_{n-1}(x) - J_{n+1}(x)]$$

follows directly from differentiation of

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi} \cos(n\theta - x\sin\theta)\, d\theta.$$

12.1.13 Evaluate

$$\int_0^{\infty} e^{-ax} J_0(bx)\, dx, \quad a, b > 0.$$

Actually the results hold for $a \ge 0$, $-\infty < b < \infty$. This is a Laplace
transform of $J_0$.
Hint. Either an integral representation of $J_0$ or a series expansion or
a Laplace transformation of Bessel's ODE will be helpful.

12.1.14 Using trigonometric forms [Eq. (12.29)], verify that

$$J_0(br) = \frac{1}{2\pi}\int_0^{2\pi} e^{ibr\sin\theta}\, d\theta.$$

12.1.15 The fraction of light incident on a circular aperture (normal incidence)
that is transmitted is given by

$$T = \int_0^{2ka} J_2(x)\left(\frac{2}{x} - \frac{1}{2ka}\right) dx,$$

where a is the radius of the aperture, and k is the wave number, $2\pi/\lambda$.
Show that

(a) $T = 1 - \dfrac{1}{ka}\sum_{n=0}^{\infty} J_{2n+1}(2ka)$, (b) $T = 1 - \dfrac{1}{2ka}\int_0^{2ka} J_0(x)\, dx$.

12.1.16 Show that, defining

$$I_{m,n}(a) = \int_0^a x^m J_n(x)\, dx, \quad m \ge n \ge 0,$$

(a) $I_{3,0}(x) = \int_0^x t^3 J_0(t)\, dt = x^3 J_1(x) - 2x^2 J_2(x)$;

(b) $I_{m,n}$ is integrable in terms of Bessel functions and powers of x [such
as $a^p J_q(a)$] for m + n odd;

(c) $I_{m,n}$ may be reduced to integrated terms plus $\int_0^a J_0(x)\, dx$ for m + n
even.

12.1.17 Solve the ODE $x^2 y''(x) + axy'(x) + (1 + b^2x^2)y(x) = 0$, where a, b
are real parameters, using the substitution $y(x) = x^{-n}v(x)$. Adjust n
and a so that the ODE for v becomes Bessel's ODE for $J_0$. Find the
general solution y(x) for this value of a.


12.2 Neumann Functions, Bessel Functions of the Second Kind


From the theory of ODEs it is known that Bessel's second-order ODE has
two independent solutions. Indeed, for nonintegral order $\nu$ we have already
found two solutions and labeled them $J_\nu(x)$ and $J_{-\nu}(x)$ using the infinite series
[Eq. (12.15)]. The trouble is that when $\nu$ is integral, Eq. (12.18) holds and we
have but one independent solution. A second solution may be developed by
the methods of Section 8.6. This yields a perfectly good second solution of
Bessel's equation but is not the standard form.



Definition and Series Form 


As an alternate approach, we take the particular linear combination of $J_\nu(x)$
and $J_{-\nu}(x)$

$$Y_\nu(x) = \frac{\cos(\nu\pi)\,J_\nu(x) - J_{-\nu}(x)}{\sin\nu\pi}. \tag{12.72}$$

Figure 12.7 Neumann Functions $Y_0(x)$, $Y_1(x)$, and $Y_2(x)$

This is the Neumann function (Fig. 12.7).¹⁰ For nonintegral $\nu$, $Y_\nu(x)$ clearly
satisfies Bessel's equation because it is a linear combination of known solutions,
$J_\nu(x)$ and $J_{-\nu}(x)$. Substituting the power series [Eq. (12.15)] for $n \to \nu$ yields

$$Y_\nu(x) = -\frac{(\nu - 1)!}{\pi}\left(\frac{2}{x}\right)^{\nu} + \cdots \tag{12.73}$$

for $\nu > 0$.¹¹ However, for integral $\nu$, Eq. (12.18) applies and Eq. (12.72) becomes
indeterminate. The definition of $Y_\nu(x)$ was chosen deliberately for this
indeterminate property. Again substituting the power series and evaluating $Y_\nu(x)$ for
$\nu \to 0$ by l'Hôpital's rule for indeterminate forms, we obtain the limiting value

$$Y_0(x) = \frac{2}{\pi}(\ln x + \gamma - \ln 2) + O(x^2) \tag{12.74}$$

for n = 0 and $x \to 0$, using

$$\nu!\,(-\nu)! = \frac{\pi\nu}{\sin\pi\nu} \tag{12.75}$$

from Eq. (10.32). The first and third terms in Eq. (12.74) come from using
$(d/d\nu)(x/2)^{\nu} = (x/2)^{\nu}\ln(x/2)$, whereas $\gamma$ comes from $(d/d\nu)\,\nu!$ for $\nu \to 0$
using Eqs. (10.38) and (10.40). For n > 0 we obtain similarly

$$Y_n(x) = -\frac{(n - 1)!}{\pi}\left(\frac{2}{x}\right)^{n} + \cdots + \frac{2}{\pi}\left(\frac{x}{2}\right)^{n}\frac{1}{n!}\ln\frac{x}{2} + \cdots. \tag{12.76}$$

¹⁰ We use the notation in AMS-55 and in most mathematics tables.

¹¹ Note that this limiting form applies to both integral and nonintegral values of the index $\nu$.

Equations (12.74) and (12.76) exhibit the logarithmic dependence that was to
be expected. This, of course, verifies the independence of $J_n$ and $Y_n$.
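The definition in Eq. (12.72) can be evaluated directly for nonintegral $\nu$, and taking $\nu$ very close to 0 gives a rough numerical stand-in for the limit that defines $Y_0$. The sketch below assumes the standard power series for $J_{\pm\nu}$ (the series the text calls Eq. (12.15)); the closed form $Y_{1/2}(x) = -\sqrt{2/(\pi x)}\,\cos x$ used for comparison is a standard result that follows from Eq. (12.72) together with the known form of $J_{-1/2}$, and the value of the Euler-Mascheroni constant is typed in by hand.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def bessel_j(nu, x, terms=40):
    """Series for J_nu(x), x > 0, nu not a negative integer (assumed form of Eq. 12.15)."""
    return sum(
        (-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1)) * (x / 2) ** (2 * s + nu)
        for s in range(terms)
    )


def neumann_y(nu, x):
    """Y_nu(x) from Eq. (12.72); only meaningful for nonintegral nu."""
    return (math.cos(nu * math.pi) * bessel_j(nu, x) - bessel_j(-nu, x)) / math.sin(nu * math.pi)


# Half-integral order: compare with the closed form for Y_{1/2}.
print(neumann_y(0.5, 1.0), -math.sqrt(2 / math.pi) * math.cos(1.0))

# Order close to zero and small x: compare with the limiting form of Eq. (12.74).
x_small = 0.01
limiting = (2 / math.pi) * (math.log(x_small) + EULER_GAMMA - math.log(2.0))
print(neumann_y(1e-6, x_small), limiting)
```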



Other Forms 


As with all the other Bessel functions, $Y_\nu(x)$ has integral representations. For
$Y_0(x)$ we have

$$Y_0(x) = -\frac{2}{\pi}\int_0^{\infty} \cos(x\cosh t)\, dt = -\frac{2}{\pi}\int_1^{\infty} \frac{\cos(xt)}{(t^2 - 1)^{1/2}}\, dt, \quad x > 0.$$

These forms can be derived as the imaginary part of the Hankel representations
of Section 12.3. The latter form is a Fourier cosine transform.

The most general solution of Bessel's ODE for any $\nu$ can be written as

$$y(x) = A J_\nu(x) + B Y_\nu(x). \tag{12.77}$$

It is seen from Eqs. (12.74) and (12.76) that $Y_n$ diverges at least
logarithmically. Some boundary condition that requires the solution of a problem with
Bessel function solutions to be finite at the origin automatically excludes $Y_n(x)$.
Conversely, in the absence of such a requirement $Y_n(x)$ must be considered.


Biographical Data 

Neumann, Karl. Neumann, a German mathematician and physicist, was
born in 1832 and died in 1925. He was appointed a professor of mathematics
at the University of Leipzig in 1868. His main contributions were to potential
theory and partial differential and integral equations.


Recurrence Relations 


Substituting Eq. (12.72) for $Y_\nu(x)$ (nonintegral $\nu$) or Eq. (12.76) (integral $\nu$) into
the recurrence relations [Eqs. (12.8) and (12.21)] for $J_n(x)$, we see immediately
that $Y_\nu(x)$ satisfies these same recurrence relations. This actually constitutes
another proof that $Y_\nu$ is a solution. Note that the converse is not necessarily
true. All solutions need not satisfy the same recurrence relations because $Y_\nu$
for nonintegral $\nu$ also involves $J_{-\nu} \ne J_\nu$ obeying recursions with $\nu \to -\nu$.


Wronskian Formulas

From Section 8.6 and Exercise 9.1.3 we have the Wronskian formula¹² for
solutions of the Bessel equation

$$u_\nu(x)v_\nu'(x) - u_\nu'(x)v_\nu(x) = \frac{A_\nu}{x}, \tag{12.78}$$

in which $A_\nu$ is a parameter that depends on the particular Bessel functions $u_\nu(x)$
and $v_\nu(x)$ being considered. It is a constant in the sense that it is independent
of x. Consider the special case

$$u_\nu(x) = J_\nu(x), \qquad v_\nu(x) = J_{-\nu}(x), \tag{12.79}$$

$$J_\nu J_{-\nu}' - J_\nu' J_{-\nu} = \frac{A_\nu}{x}. \tag{12.80}$$

¹² This result depends on P(x) of Section 8.5 being equal to $p'(x)/p(x)$, the corresponding
coefficient of the self-adjoint form of Section 9.1.

Since $A_\nu$ is a constant, it may be identified using the leading terms in the power
series expansions [Eqs. (12.15) and (12.17)]. All other powers of x cancel. We
obtain

$$J_\nu \to \frac{x^{\nu}}{2^{\nu}\nu!}, \qquad J_{-\nu} \to \frac{2^{\nu}x^{-\nu}}{(-\nu)!}, \qquad J_\nu' \to \frac{\nu x^{\nu-1}}{2^{\nu}\nu!}, \qquad J_{-\nu}' \to -\frac{\nu 2^{\nu}x^{-\nu-1}}{(-\nu)!}. \tag{12.81}$$

Substitution into Eq. (12.80) yields

$$J_\nu(x)J_{-\nu}'(x) - J_\nu'(x)J_{-\nu}(x) = \frac{-2\nu}{x\,\nu!\,(-\nu)!} = -\frac{2\sin\nu\pi}{\pi x}. \tag{12.82}$$

Note that $A_\nu$ vanishes for integral $\nu$, as it must since the nonvanishing of the
Wronskian is a test of the independence of the two solutions. By Eq. (12.18),
$J_n$ and $J_{-n}$ are clearly linearly dependent.

Using our recurrence relations, we may readily develop a large number of
alternate forms, among which are

$$J_\nu J_{-\nu+1} + J_{-\nu}J_{\nu-1} = \frac{2\sin\nu\pi}{\pi x}, \tag{12.83}$$

$$J_\nu J_{-\nu-1} + J_{-\nu}J_{\nu+1} = -\frac{2\sin\nu\pi}{\pi x}, \tag{12.84}$$

$$J_\nu Y_\nu' - J_\nu' Y_\nu = \frac{2}{\pi x}, \tag{12.85}$$

$$J_\nu Y_{\nu+1} - J_{\nu+1}Y_\nu = -\frac{2}{\pi x}. \tag{12.86}$$

Many more can be found in Additional Reading.

The reader will recall that in Chapter 8 Wronskians were of great value in
two respects: (i) in establishing the linear independence or linear dependence
of solutions of differential equations and (ii) in developing an integral form of
a second solution. Here, the specific forms of the Wronskians and
Wronskian-derived combinations of Bessel functions are useful primarily to illustrate the
general behavior of the various Bessel functions. Wronskians are of great use
in checking tables of Bessel functions numerically.


EXAMPLE 12.2.1


Coaxial Waveguides We are interested in an electromagnetic wave confined
between the concentric, conducting cylindrical surfaces $\rho = a$ and $\rho = b$. Most
of the mathematics is worked out in Example 12.1.5. That is, we work in
cylindrical coordinates and separate the time dependence as before, which is that
of a traveling wave $e^{i(kz - \omega t)}$ now instead of standing waves in Example 12.1.5.
To implement this, we let A = iB in the solution $w(z) = A\sin kz + B\cos kz$
and obtain

$$E_z = \sum_{m,n} b_{mn} J_m(\gamma\rho)\,e^{\pm im\varphi}\,e^{i(kz - \omega t)}. \tag{12.87}$$

For the coaxial waveguide both the Bessel and Neumann functions contribute
because the origin $\rho = 0$ can no longer be used to exclude the Neumann
functions: it is not part of the physical region ($0 < a \le \rho \le b$). It is
consistent that there are now two boundary conditions, at $\rho = a$ and
$\rho = b$. With the Neumann function $Y_m(\gamma\rho)$, $E_z(\rho, \varphi, z, t)$
becomes

$$E_z = \sum_{m,n}\left[b_{mn}J_m(\gamma_{mn}\rho) + c_{mn}Y_m(\gamma_{mn}\rho)\right]e^{\pm im\varphi}e^{i(kz-\omega t)}, \tag{12.88}$$

where $\gamma_{mn}$ will be determined from boundary conditions. With the
transverse magnetic field condition

$$H_z = 0 \tag{12.89}$$

everywhere, we have the basic equations for a TM wave.

The (tangential) electric field must vanish at the conducting surfaces
(Dirichlet boundary condition), or

$$b_{mn}J_m(\gamma_{mn}a) + c_{mn}Y_m(\gamma_{mn}a) = 0, \tag{12.90}$$

$$b_{mn}J_m(\gamma_{mn}b) + c_{mn}Y_m(\gamma_{mn}b) = 0. \tag{12.91}$$

For a nontrivial solution $b_{mn}$, $c_{mn}$ of these homogeneous linear
equations to exist, their determinant must be zero. The resulting transcendental
equation, $J_m(\gamma_{mn}a)Y_m(\gamma_{mn}b) = J_m(\gamma_{mn}b)Y_m(\gamma_{mn}a)$,
may be solved for $\gamma_{mn}$, and then the ratio $c_{mn}/b_{mn}$ can be
determined. From Example 12.1.5,

$$k^2 = \omega^2\mu_0\varepsilon_0 - \gamma_{mn}^2 = \frac{\omega^2}{c^2} - \gamma_{mn}^2, \tag{12.92}$$

where $c$ is the velocity of light. Since $k^2$ must be positive for an
oscillatory solution, the minimum frequency that will be propagated (in this TM
mode) is

$$\omega = \gamma_{mn}c, \tag{12.93}$$

with $\gamma_{mn}$ fixed by the boundary conditions, Eqs. (12.90) and (12.91).
This is the cutoff frequency of the waveguide. In general, at any given
frequency only a finite number of modes can propagate. The dimensions ($a < b$)
of the cylindrical guide are often chosen so that, at a given frequency, only
the lowest mode $k$ can propagate.

There is also a transverse electric mode with $E_z = 0$ and $H_z$ given by
Eq. (12.88). ■
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SUMMARY 


To conclude this discussion of Neumann functions, we introduce the Neumann
function, $Y_\nu(x)$, for the following reasons:

1. It is a second, independent solution of Bessel’s equation, which completes 
the general solution. 

2. It is required for specific physical problems, such as electromagnetic waves 
in coaxial cables and quantum mechanical scattering theory. 

3. It leads directly to the two Hankel functions (Section 12.3). 


EXERCISES 

12.2.1 Prove that the Neumann functions $Y_n$ (with $n$ an integer) satisfy the
recurrence relations

$$Y_{n-1}(x) + Y_{n+1}(x) = \frac{2n}{x}Y_n(x),$$

$$Y_{n-1}(x) - Y_{n+1}(x) = 2Y'_n(x).$$

Hint. These relations may be proved by differentiating the recurrence
relations for $J_\nu$ or by using the limit form of $Y_\nu$ but not dividing
everything by zero.

12.2.2 Show that

$$Y_{-n}(x) = (-1)^nY_n(x).$$

12.2.3 Show that

$$Y'_0(x) = -Y_1(x).$$

12.2.4 If $Y$ and $Z$ are any two solutions of Bessel's equation, show that

$$Y_\nu(x)Z'_\nu(x) - Y'_\nu(x)Z_\nu(x) = \frac{A_\nu}{x},$$

in which $A_\nu$ may depend on $\nu$ but is independent of $x$. This is a special
case of Exercise 9.1.3.

12.2.5 Verify the Wronskian formulas

$$J_\nu(x)J_{-\nu+1}(x) + J_{-\nu}(x)J_{\nu-1}(x) = \frac{2\sin\nu\pi}{\pi x},$$

$$J_\nu(x)Y'_\nu(x) - J'_\nu(x)Y_\nu(x) = \frac{2}{\pi x}.$$

12.2.6 As an alternative to letting $x$ approach zero in the evaluation of the
Wronskian constant, we may invoke uniqueness of power series
(Section 5.7). The coefficient of $x^{-1}$ in the series expansion of
$u_\nu(x)v'_\nu(x) - u'_\nu(x)v_\nu(x)$ is then $A_\nu$. Show by series expansion
that the coefficients of $x^0$ and $x^1$ of
$J_\nu(x)J'_{-\nu}(x) - J'_\nu(x)J_{-\nu}(x)$ are each zero.
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12.2.7 (a) By differentiating and substituting into Bessel’s ODE, show that 


$$\int_0^\infty \cos(x\cosh t)\,dt$$

is a solution.

Hint. You can rearrange the final integral as

$$\int_0^\infty \frac{d}{dt}\left\{x\sin(x\cosh t)\sinh t\right\}dt.$$

(b) Show that

$$Y_0(x) = -\frac{2}{\pi}\int_0^\infty \cos(x\cosh t)\,dt$$

is linearly independent of $J_0(x)$.


12.3 Asymptotic Expansions 


Frequently, in physical problems there is a need to know how a given Bessel 
function behaves for large values of the argument, that is, the asymptotic 
behavior. This is one occasion when computers are not very helpful, except 
in matching numerical solutions to known asymptotic forms or checking an 
asymptotic guess numerically. One possible approach is to develop a power 
series solution of the differential equation, as in Section 8.5, but now 
using negative powers. This is Stokes's method. The limitation is that starting
from some positive value of the argument (for convergence of the series),
we do not know what mixture of solutions or multiple of a given solution 
we have. The problem is to relate the asymptotic series (useful for large 
values of the variable) to the power series or related definition (useful for 
small values of the variable). This relationship can be established by
introducing a suitable integral representation and then using either the method
of steepest descent (Section 7.3) or the direct expansion as developed in this 
section. 

Integral representations have appeared before: Eq. (10.35) for $\Gamma(z)$ and
various representations of $J_\nu(z)$ in Section 12.1. With these integral
representations of the Bessel (and Hankel) functions, it is perhaps appropriate
to ask why we are interested in integral representations. There are at least
four reasons. The first is simply aesthetic appeal. Second, the integral
representations help to distinguish between two linearly independent solutions
(Section 7.3). Third, the integral representations facilitate manipulations,
analysis, and the development of relations among the various special functions.
Fourth, and probably most important, the integral representations are extremely
useful in developing asymptotic expansions. One approach, the method of steepest
descents, appears in Section 7.3 and is used here.
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Figure 12.8 

Bessel Function 
Contour 



Hankel functions are introduced here for the following reasons: 

• As Bessel function analogs of $e^{\pm ix}$ they are useful for describing
traveling waves.

• They offer an alternate (contour integral) and an elegant definition of Bessel 
functions. 


Expansion of an Integral Representation 

As a direct approach, consider the integral representation (Schlaefli integral)

$$J_\nu(z) = \frac{1}{2\pi i}\int_C e^{(z/2)(t-1/t)}\,t^{-\nu-1}\,dt, \tag{12.94}$$

with the contour $C$ around the origin in the positive mathematical sense
displayed in Fig. 12.8. This formula follows from Cauchy's theorem, applied to
the defining Eq. (12.9) of the generating function given by Eq. (12.16) as the
exponential in the integral. This proves Eq. (12.94) for
$-\pi < \arg z < 2\pi$, but only for $\nu$ = integer. If $\nu$ is not an
integer, the integrand is not single-valued and a cut line is needed in our
complex $t$ plane. Choosing the negative real axis as the cut line and using the
contour shown in Fig. 12.8, we can extend Eq. (12.94) to nonintegral $\nu$. For
this case, we still need to verify Bessel's ODE by substituting the integral
representation [Eq. (12.94)],

$$z^2J''_\nu(z) + zJ'_\nu(z) + (z^2-\nu^2)J_\nu(z) = \frac{1}{2\pi i}\int_C e^{(z/2)(t-1/t)}\,t^{-\nu-1}\left[\frac{z^2}{4}\left(t+\frac{1}{t}\right)^2 + \frac{z}{2}\left(t-\frac{1}{t}\right) - \nu^2\right]dt, \tag{12.95}$$

where the integrand can be verified to be the following exact derivative that
vanishes as $t \to \infty e^{\pm i\pi}$:

$$\frac{d}{dt}\left\{e^{(z/2)(t-1/t)}\,t^{-\nu}\left[\nu + \frac{z}{2}\left(t+\frac{1}{t}\right)\right]\right\}. \tag{12.96}$$


Hence, the integral in Eq. (12.95) vanishes and Bessel’s ODE is satisfied. 
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Figure 12.9 

Hankel Function 
Contours 



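We now deform the contour so that it approaches the origin along the
positive real axis, as shown in Fig. 12.9. This particular approach guarantees
that the exact derivative in Eq. (12.96) will vanish as $t \to 0$ because of the
$e^{-z/2t}$ factor. Hence, each of the separate portions corresponding to
$\infty e^{-i\pi}$ to 0 and 0 to $\infty e^{i\pi}$ is a solution of Bessel's
ODE. We define

$$H_\nu^{(1)}(z) = \frac{1}{\pi i}\int_0^{\infty e^{i\pi}} e^{(z/2)(t-1/t)}\,\frac{dt}{t^{\nu+1}}, \tag{12.97}$$

$$H_\nu^{(2)}(z) = \frac{1}{\pi i}\int_{\infty e^{-i\pi}}^{0} e^{(z/2)(t-1/t)}\,\frac{dt}{t^{\nu+1}}, \tag{12.98}$$

so that

$$J_\nu(z) = \frac{1}{2}\left[H_\nu^{(1)}(z) + H_\nu^{(2)}(z)\right]. \tag{12.99}$$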

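These expressions are particularly convenient because they may be handled
by the method of steepest descents (Section 7.3). $H_\nu^{(1)}(z)$ has a saddle
point at $t = +i$, whereas $H_\nu^{(2)}(z)$ has a saddle point at $t = -i$. To
leading order Eq. (7.84) yields

$$H_\nu^{(1)}(z) \sim \sqrt{\frac{2}{\pi z}}\exp\left\{i\left[z - \left(\nu+\frac{1}{2}\right)\frac{\pi}{2}\right]\right\} \tag{12.100}$$

for large $|z|$ in the region $-\pi < \arg z < 2\pi$. The second Hankel function
is just the complex conjugate of the first (for real argument $z$) so that

$$H_\nu^{(2)}(z) \sim \sqrt{\frac{2}{\pi z}}\exp\left\{-i\left[z - \left(\nu+\frac{1}{2}\right)\frac{\pi}{2}\right]\right\} \tag{12.101}$$

for large $|z|$ with $-2\pi < \arg z < \pi$.

In addition to Eq. (12.99) we can also show that

$$Y_\nu(z) = \frac{1}{2i}\left[H_\nu^{(1)}(z) - H_\nu^{(2)}(z)\right]. \tag{12.102}$$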
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This may be accomplished by the following steps: 

1. With the substitutions $t = e^{i\pi}/s$ for $H_\nu^{(1)}$ in Eq. (12.97) and
$t = e^{-i\pi}/s$ for $H_\nu^{(2)}$ in Eq. (12.98), we obtain

$$H_\nu^{(1)}(z) = e^{-i\nu\pi}H_{-\nu}^{(1)}(z), \tag{12.103}$$

$$H_\nu^{(2)}(z) = e^{i\nu\pi}H_{-\nu}^{(2)}(z). \tag{12.104}$$

2. From Eqs. (12.99) ($\nu \to -\nu$), (12.103), and (12.104), we get

$$J_{-\nu}(z) = \frac{1}{2}\left[e^{i\nu\pi}H_\nu^{(1)}(z) + e^{-i\nu\pi}H_\nu^{(2)}(z)\right]. \tag{12.105}$$

3. Finally, substitute $J_\nu$ [Eq. (12.99)] and $J_{-\nu}$ [Eq. (12.105)] into
the defining equation for $Y_\nu$, Eq. (12.72). This leads to Eq. (12.102) and
establishes the contour integrals [Eqs. (12.97) and (12.98)] as the standard
Hankel functions.

Since $J_\nu(z)$ is the real part of $H_\nu^{(1)}(z)$ for real $z$,

$$J_\nu(z) \sim \sqrt{\frac{2}{\pi z}}\cos\left[z - \left(\nu+\frac{1}{2}\right)\frac{\pi}{2}\right] \tag{12.106}$$

for large $|z|$ with $-\pi < \arg z < \pi$. The Neumann function is the
imaginary part of $H_\nu^{(1)}(z)$ or

$$Y_\nu(z) \sim \sqrt{\frac{2}{\pi z}}\sin\left[z - \left(\nu+\frac{1}{2}\right)\frac{\pi}{2}\right] \tag{12.107}$$

for large $|z|$ with $-\pi < \arg z < \pi$.

It is of interest to consider the accuracy of the asymptotic forms, taking
only the first term [Eq. (12.106)], for example (Fig. 12.10). Clearly, the
condition for the validity of Eq. (12.106) is that the next (a nonleading sine)
term in Eq. (12.106) be negligible; estimating this leads to $8x \gg 4n^2 - 1$.
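
The accuracy statement can be checked numerically for $\nu = 0$. The sketch
below is only an illustration under stated assumptions: the reference value of
$J_0(x)$ is taken from Bessel's integral
$J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin\theta)\,d\theta$ (assumed here as
an independent representation), evaluated with a midpoint rule, and the leading
asymptotic term is $\sqrt{2/\pi x}\,\cos(x - \pi/4)$. The function names and
the panel count are ours.

```python
import math


def j0_integral(x, panels=2000):
    """Reference J_0(x) from Bessel's integral, midpoint rule on [0, pi]."""
    h = math.pi / panels
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(panels))
    return total * h / math.pi


def j0_asymptotic(x):
    """Leading asymptotic term for nu = 0: sqrt(2/(pi x)) cos(x - pi/4)."""
    return math.sqrt(2 / (math.pi * x)) * math.cos(x - math.pi / 4)


if __name__ == "__main__":
    for x in (2.0, 5.0, 20.0):
        exact, approx = j0_integral(x), j0_asymptotic(x)
        print(f"x = {x:5.1f}  J0 = {exact:+.6f}  asymptotic = {approx:+.6f}  "
              f"error = {abs(exact - approx):.2e}")
```

For $n = 0$ the condition $8x \gg 4n^2 - 1$ is mild, and the printed error
shrinks as $x$ grows.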



Figure 12.10 Asymptotic Approximation of $J_0(x)$

To a certain extent the definition of the Neumann function $Y_n(x)$ is
arbitrary. Equations (12.72) and (12.76) contain terms of the form $a_nJ_n(x)$.
Clearly, any finite value of the constant $a_n$ would still give us a second
solution of Bessel's equation. Why should $a_n$ have the particular value
implicit in Eqs. (12.72) and (12.76)? The answer is given by the asymptotic
dependence developed here. If $J_n$ corresponds to a cosine wave asymptotically
[Eq. (12.106)], then $Y_n$ corresponds to a sine wave [Eq. (12.107)]. This
simple and convenient asymptotic phase relationship is a consequence of the
particular admixture of $J_n$ in $Y_n$.

This completes our determination of the asymptotic expansions. However, it is
worth noting the primary characteristics. Apart from the ubiquitous $z^{-1/2}$,
$J_\nu(z)$ and $Y_\nu(z)$ behave as cosine and sine, respectively. The zeros are
almost evenly spaced at intervals of $\pi$; the spacing becomes exactly $\pi$ in
the limit as $z \to \infty$. The Hankel functions have been defined to behave
like the imaginary exponentials. This asymptotic behavior may be sufficient to
eliminate immediately one of these functions as a solution for a physical
problem. This is illustrated in the next example.


EXAMPLE 12.3.1 


Cylindrical Traveling Waves As an illustration of the use of Hankel functions,
consider a two-dimensional wave problem similar to the vibrating circular
membrane of Example 12.1.4. Now imagine that the waves are generated at
$\rho = 0$ and move outward to infinity. We replace our standing waves by
traveling ones. The differential equation remains the same, but the boundary
conditions change. We now demand that for large $\rho$ the solution behaves like

$$U \to e^{i(k\rho-\omega t)} \tag{12.108}$$

to describe an outgoing wave. As before, $k$ is the wave number. This assumes,
for simplicity, that there is no azimuthal dependence, that is, no angular
momentum, or $m = 0$. In Sections 7.4 and 12.3, $H_0^{(1)}(k\rho)$ is shown to
have the asymptotic behavior (for $\rho \to \infty$)

$$H_0^{(1)}(k\rho) \to e^{ik\rho}. \tag{12.109}$$

This boundary condition at infinity then determines our wave solution as

$$U(\rho, t) = H_0^{(1)}(k\rho)\,e^{-i\omega t}. \tag{12.110}$$

This solution diverges as $\rho \to 0$, which is just the behavior to be
expected with a source at the origin representing a singularity reflected by
singular behavior of the solution. ■


The choice of a two-dimensional wave problem to illustrate the Hankel
function $H_0^{(1)}(z)$ is not accidental. Bessel functions may appear in a
variety of ways, such as in the separation in conical coordinates. However, they
enter most commonly from the radial equations from the separation of variables
in the Helmholtz equation in cylindrical and in spherical polar coordinates. We
have used a degenerate form of cylindrical coordinates for this illustration.
Had we used spherical polar coordinates (spherical waves), we should have
encountered index $\nu = n + \frac{1}{2}$, $n$ an integer. These special values
yield the spherical Bessel functions discussed in Section 12.4.

Finally, as pointed out in Section 12.2, the asymptotic forms may be used 
to evaluate the various Wronskian formulas (compare Exercise 12.3.2). 


Numerical Evaluation 


When a computer program calls for one of the Bessel or modified Bessel
functions, the programmer has two alternatives: to store all the Bessel
functions and tell the computer how to locate the required value or to instruct
the computer to simply calculate the needed value. The first alternative would
be fairly slow and would place unreasonable demands on storage capacity. Thus,
our programmer adopts the “compute it yourself” alternative.

Let us discuss the computation of $J_n(x)$ using the recurrence relation
[Eq. (12.8)]. Given $J_0$ and $J_1$, for example, $J_2$ (and any other integral
order $J_n$) may be computed from Eq. (12.8). With the opportunities offered by
computers, Eq. (12.8) has acquired an interesting new application. In computing
a numerical value of $J_N(x_0)$ for a given $x_0$, one could use the series form
of Eq. (12.15) for small $x$ or the asymptotic form [Eq. (12.106)] for large
$x$. A better way, in terms of accuracy and machine utilization, is to use the
recurrence relation [Eq. (12.8)] and work down. 13 With $n \gg N$ and
$n \gg x_0$, assume

$$J_{n+1}(x_0) = 0 \quad \text{and} \quad J_n(x_0) = \alpha,$$

where $\alpha$ is some small number. Then Eq. (12.8) leads to $J_{n-1}(x_0)$,
$J_{n-2}(x_0)$, and so on, and finally to $J_0(x_0)$. Since $\alpha$ is
arbitrary, the $J_n$ are all off by a common factor. This factor is determined
by the condition

$$J_0(x_0) + 2\sum_{m=1}^{\infty} J_{2m}(x_0) = 1.$$

(See Example 12.1.1.) The accuracy of this calculation is checked by trying
again at $n' = n + 3$. This technique yields the desired $J_N(x_0)$ and all the
lower integral index $J_n$ down to $J_0$, and it avoids the fatal accumulation
of rounding errors in a recursion relation that works up. High-precision
numerical computation is more or less an art. Modifications and refinements of
this and other numerical techniques are proposed every year. For information on
the current “state of the art,” the student will have to consult the literature,
such as Numerical Recipes in Additional Reading of Chapter 8, Atlas for
Computing Mathematical Functions in Additional Reading of Chapter 13, or the
journal Mathematics of Computation.
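
The downward recurrence just described can be sketched in a few lines. This is
a minimal illustration, not the routine of the cited reference: it assumes that
Eq. (12.8) is the standard relation $J_{n-1}(x) + J_{n+1}(x) = (2n/x)J_n(x)$,
and the starting order `max(n_max, int(x)) + extra`, the seed value `1e-30`,
and the function name `bessel_j_down` are illustrative choices of ours.

```python
def bessel_j_down(n_max, x, extra=20):
    """Return [J_0(x), ..., J_{n_max}(x)] for x > 0 by downward recurrence."""
    start = max(n_max, int(x)) + extra   # starting order n, well above N and x
    values = [0.0] * (start + 1)
    j_above, j_here = 0.0, 1e-30         # J_{n+1} = 0, J_n = alpha (arbitrary, small)
    values[start] = j_here
    for k in range(start, 0, -1):
        # J_{k-1} = (2k/x) J_k - J_{k+1}
        j_below = (2 * k / x) * j_here - j_above
        values[k - 1] = j_below
        j_above, j_here = j_here, j_below
    # Fix the common factor with J_0 + 2 * sum_{m>=1} J_{2m} = 1.
    norm = values[0] + 2 * sum(values[2::2])
    return [v / norm for v in values[: n_max + 1]]


if __name__ == "__main__":
    print(bessel_j_down(2, 1.0))  # J_0(1), J_1(1), J_2(1)
```

Rerunning with a larger `extra` (the analog of trying again at $n' = n + 3$)
and comparing the results gives a simple check on the starting order.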


13 Stegun, I.A., and Abramowitz, M. (1957). Generation of Bessel functions on
computers. Math. Tables Aids Comput. 11, 255-257.

Table 12.2 Equations for the Computation of Neumann Functions

              Power Series             Asymptotic Series
$Y_n(x)$      Eq. (12.76), x ≤ 4       Eq. (12.107), x > 4

Note: In practice, it is convenient to limit the series (power or asymptotic)
computation of $Y_n(x)$ to $n = 0, 1$. Then $Y_n(x)$, $n \ge 2$ is computed
using the recurrence relation, Eq. (12.8).

For Y n , the preferred methods are the series if x is small and the asymptotic 
forms (with many terms in the series of negative powers) if x is large. The 
criteria of large and small may vary as shown in Table 12.2. 


EXERCISES 


12.3.1 In checking the normalization of the integral representation of
$J_\nu(z)$ [Eq. (12.94)], we assumed that $Y_\nu(z)$ was not present. How do we
know that the integral representation [Eq. (12.94)] does not yield
$J_\nu(z) + \varepsilon Y_\nu(z)$ with $\varepsilon \neq 0$ albeit small?

12.3.2 Use the asymptotic expansions to verify the following Wronskian
formulas:

(a) $J_\nu(x)J_{-\nu-1}(x) + J_{-\nu}(x)J_{\nu+1}(x) = -2\sin\nu\pi/\pi x$,

(b) $J_\nu(x)Y_{\nu+1}(x) - J_{\nu+1}(x)Y_\nu(x) = -2/\pi x$,

(c) $J_\nu(x)H_{\nu-1}^{(2)}(x) - J_{\nu-1}(x)H_\nu^{(2)}(x) = 2/i\pi x$.

12.3.3 Stokes's method.

(a) Replace the Bessel function in Bessel's equation by $x^{-1/2}y(x)$ and
show that $y(x)$ satisfies

$$y''(x) + \left(1 - \frac{\nu^2 - \frac{1}{4}}{x^2}\right)y(x) = 0.$$

(b) Develop a power series solution with negative powers of $x$ starting
with the assumed form

$$y(x) = e^{ix}\sum_{n=0}^{\infty} a_nx^{-n}.$$

Determine the recurrence relation giving $a_{n+1}$ in terms of $a_n$. Check
your result against the asymptotic formula, Eq. (12.106).

(c) From the results of Section 7.3, determine the initial coefficient, $a_0$.

12.3.4 (a) Write a subroutine that will generate Bessel functions, that
is, will generate the numerical value of $J_n(x)$ given $x$ and $n$.

(b) Check your subroutine by using symbolic software, such as Maple
and Mathematica. If possible, compare the machine time needed
for this check for several $n$ and $x$ with the time required for your
subroutine.
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12.4 Spherical Bessel Functions 


When the Helmholtz equation is separated in spherical coordinates the radial
equation has the form

$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + \left[k^2r^2 - n(n+1)\right]R = 0. \tag{12.111}$$

This is Eq. (8.62) of Section 8.5, as we now show. The parameter $k$ enters
from the original Helmholtz equation, whereas $n(n+1)$ is a separation
constant. From the behavior of the polar angle function (Legendre's equation;
Sections 4.3, 8.9, and 11.2), the separation constant must have this form, with
$n$ a nonnegative integer. Equation (12.111) has the virtue of being
self-adjoint, but clearly it is not Bessel's ODE. However, if we substitute

$$R(kr) = \frac{Z(kr)}{(kr)^{1/2}},$$

Equation (12.111) becomes

$$r^2\frac{d^2Z}{dr^2} + r\frac{dZ}{dr} + \left[k^2r^2 - \left(n+\frac{1}{2}\right)^2\right]Z = 0, \tag{12.112}$$

which is Bessel's equation. $Z$ is a Bessel function of order $n + \frac{1}{2}$
($n$ an integer). Because of the importance of spherical coordinates, this
combination,

$$\frac{Z_{n+1/2}(kr)}{(kr)^{1/2}},$$

occurs often in physics problems.



Definitions 


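It is convenient to label these functions spherical Bessel functions, with the
following defining equations:

$$j_n(x) = \sqrt{\frac{\pi}{2x}}\,J_{n+1/2}(x),$$

$$y_n(x) = \sqrt{\frac{\pi}{2x}}\,Y_{n+1/2}(x) = (-1)^{n+1}\sqrt{\frac{\pi}{2x}}\,J_{-n-1/2}(x),^{14}$$

$$h_n^{(1)}(x) = \sqrt{\frac{\pi}{2x}}\,H_{n+1/2}^{(1)}(x) = j_n(x) + iy_n(x), \tag{12.113}$$

$$h_n^{(2)}(x) = \sqrt{\frac{\pi}{2x}}\,H_{n+1/2}^{(2)}(x) = j_n(x) - iy_n(x).$$


14 This is possible because $\cos(n + \frac{1}{2})\pi = 0$ for $n$ an integer.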
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Figure 12.11 

Spherical Bessel 
Functions 



These spherical Bessel functions (Figs. 12.11 and 12.12) can be expressed
in series form by using the series [Eq. (12.15)] for $J_n$, replacing $n$ with
$n + \frac{1}{2}$:

$$J_{n+1/2}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s+n+\frac{1}{2})!}\left(\frac{x}{2}\right)^{2s+n+1/2}. \tag{12.114}$$

Using the Legendre duplication formula (Section 10.1),

$$z!\left(z+\tfrac{1}{2}\right)! = 2^{-2z-1}\pi^{1/2}(2z+1)!, \tag{12.115}$$

we have

$$j_n(x) = \sqrt{\frac{\pi}{2x}}\sum_{s=0}^{\infty} \frac{(-1)^s2^{2s+2n+1}(s+n)!}{\pi^{1/2}(2s+2n+1)!\,s!}\left(\frac{x}{2}\right)^{2s+n+1/2} = 2^nx^n\sum_{s=0}^{\infty} \frac{(-1)^s(s+n)!}{s!\,(2s+2n+1)!}\,x^{2s}. \tag{12.116}$$
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Figure 12.12 

Spherical Neumann 
Functions 



Now

$$Y_{n+1/2}(x) = (-1)^{n+1}J_{-n-1/2}(x) \tag{12.117}$$

from Eq. (12.72), and from Eq. (12.15) we find that

$$J_{-n-1/2}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s-n-\frac{1}{2})!}\left(\frac{x}{2}\right)^{2s-n-1/2}. \tag{12.118}$$

This yields

$$y_n(x) = (-1)^{n+1}\frac{2^n\pi^{1/2}}{x^{n+1}}\sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s-n-\frac{1}{2})!}\left(\frac{x}{2}\right)^{2s}. \tag{12.119}$$

We can also use Eq. (12.117), replacing $n \to -n-1$ in Eq. (12.116) to give

$$y_n(x) = \frac{(-1)^{n+1}}{2^nx^{n+1}}\sum_{s=0}^{\infty} \frac{(-1)^s(s-n)!}{s!\,(2s-2n)!}\,x^{2s}. \tag{12.120}$$

These series forms, Eqs. (12.116) and (12.120), are useful in two ways:

• limiting values as $x \to 0$,

• closed form representations for $n = 0$.


For the special case $n = 0$ we find from Eq. (12.116)

$$j_0(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{(2s+1)!}\,x^{2s} = \frac{\sin x}{x}, \tag{12.121}$$

whereas for $y_0$, Eq. (12.120) yields

$$y_0(x) = -\frac{\cos x}{x}. \tag{12.122}$$

From the definition of the spherical Hankel functions [Eq. (12.113)],

$$h_0^{(1)}(x) = \frac{1}{x}(\sin x - i\cos x) = -\frac{i}{x}e^{ix},$$

$$h_0^{(2)}(x) = \frac{1}{x}(\sin x + i\cos x) = \frac{i}{x}e^{-ix}. \tag{12.123}$$

Equations (12.121) and (12.122) suggest expressing all spherical Bessel
functions as combinations of sine, cosine, and inverse powers of $x$. The
appropriate combinations can be developed from the power series solutions,
Eqs. (12.116) and (12.120), but this approach is awkward. We will use recursion
relations instead.


Limiting Values 


For $x \ll 1$, 15 Eqs. (12.116) and (12.120) yield

$$j_n(x) \approx \frac{2^nn!}{(2n+1)!}\,x^n = \frac{x^n}{(2n+1)!!}, \tag{12.124}$$

$$y_n(x) \approx \frac{(-1)^{n+1}}{2^n}\,\frac{(-n)!}{(-2n)!}\,x^{-n-1} = -\frac{(2n)!}{2^nn!}\,x^{-n-1} = -(2n-1)!!\,x^{-n-1}. \tag{12.125}$$

The transformation of factorials in the expressions for $y_n(x)$ employs
Exercise 10.1.3. The limiting values of the spherical Hankel functions go as
$\pm iy_n(x)$.

The asymptotic values of $j_n$, $y_n$, $h_n^{(2)}$, and $h_n^{(1)}$ may be
obtained from the Bessel asymptotic forms (Section 12.3). We find

$$j_n(x) \sim \frac{1}{x}\sin\left(x - \frac{n\pi}{2}\right), \tag{12.126}$$

$$y_n(x) \sim -\frac{1}{x}\cos\left(x - \frac{n\pi}{2}\right), \tag{12.127}$$

$$h_n^{(1)}(x) \sim (-i)^{n+1}\frac{e^{ix}}{x} = -i\,\frac{e^{i(x-n\pi/2)}}{x}, \tag{12.128a}$$

$$h_n^{(2)}(x) \sim i^{n+1}\frac{e^{-ix}}{x} = i\,\frac{e^{-i(x-n\pi/2)}}{x}. \tag{12.128b}$$

The condition for these spherical Bessel forms is that $x \gg n(n+1)/2$. From
these asymptotic values we see that $j_n(x)$ and $y_n(x)$ are appropriate for a
description of standing spherical waves; $h_n^{(1)}(x)$ and $h_n^{(2)}(x)$
correspond to traveling spherical waves. If the time dependence for the
traveling waves is taken to be $e^{-i\omega t}$, then $h_n^{(1)}(x)$ yields an
outgoing traveling spherical wave and $h_n^{(2)}(x)$ an incoming wave. Radiation
theory in electromagnetism and scattering theory in quantum mechanics provide
many applications.


15 The condition that the second term in the series be negligible compared to
the first is actually $x \ll 2[(2n+2)(2n+3)/(n+1)]^{1/2}$ for $j_n(x)$.


EXAMPLE 12.4.1 


Particle in a Sphere An illustration of the use of the spherical Bessel
functions is provided by the problem of a quantum mechanical particle in a
sphere of radius $a$. Quantum theory requires that the wave function $\psi$,
describing our particle, satisfy

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi, \tag{12.129}$$

and the boundary conditions (i) $\psi(r \le a)$ remains finite, (ii)
$\psi(a) = 0$. This corresponds to a square well potential $V = 0$, $r \le a$,
and $V = \infty$, $r > a$. Here, $\hbar$ is Planck's constant divided by
$2\pi$, $m$ is the mass of our particle, and $E$ is its energy. Let us determine
the minimum value of the energy for which our wave equation has an acceptable
solution. Equation (12.129) is Helmholtz's equation with a radial part (compare
Example 12.1.4):

$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left[k^2 - \frac{n(n+1)}{r^2}\right]R = 0, \tag{12.130}$$

with $k^2 = 2mE/\hbar^2$. Hence, by Eq. (12.111), with $n = 0$,

$$R = Aj_0(kr) + By_0(kr).$$

We choose the orbital angular momentum index $n = 0$. Any angular dependence
would raise the energy because of the repulsive angular momentum barrier
[involving $n(n+1) > 0$]. The spherical Neumann function is rejected because of
its divergent behavior at the origin. To satisfy the second boundary condition
(for all angles), we require

$$ka = \frac{\sqrt{2mE}}{\hbar}\,a = \alpha, \tag{12.131}$$

where $\alpha$ is a root of $j_0$; that is, $j_0(\alpha) = 0$. This has the
effect of limiting the allowable energies to a certain discrete set; in other
words, application of boundary condition (ii) quantizes the energy $E$. The
smallest $\alpha$ is the first zero of $j_0$,

$$\alpha = \pi$$

and

$$E_{\min} = \frac{\pi^2\hbar^2}{2ma^2} = \frac{h^2}{8ma^2}, \tag{12.132}$$

which means that for any finite sphere the particle energy will have a positive
minimum or zero point energy. Compare this energy with
$E = \frac{h^2}{8m}\left(\frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2}\right)$
of an infinite rectangular square well of lengths $a$, $b$, $c$. This example is
an illustration of the Heisenberg uncertainty principle for
$\Delta p \sim \hbar\pi/a$ from de Broglie's relation and $\Delta r \sim a$ so
that $\Delta p\,\Delta r \sim h/2$. ■
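
As a worked number for Eq. (12.132), the sketch below locates the first zero of
$j_0(x) = \sin x/x$ by bisection and evaluates the zero point energy. The
choice of an electron in a sphere of radius 1 nm is purely illustrative and not
from the text; the physical constants are standard SI values, and the function
names are ours.

```python
import math

H = 6.62607015e-34        # Planck constant h, J s
HBAR = H / (2 * math.pi)  # h divided by 2 pi
M_E = 9.1093837015e-31    # electron mass, kg (illustrative particle)
EV = 1.602176634e-19      # joules per electron volt


def j0(x):
    """Spherical Bessel function j_0(x) = sin(x)/x, Eq. (12.121)."""
    return math.sin(x) / x


def first_zero(f, lo, hi, tol=1e-14):
    """Bisection for a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_lo * f(mid) <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f(mid)
    return 0.5 * (lo + hi)


def ground_energy(mass, radius):
    """E_min = (alpha * hbar)^2 / (2 m a^2), alpha the first zero of j_0."""
    alpha = first_zero(j0, 2.0, 4.0)
    return (alpha * HBAR) ** 2 / (2 * mass * radius ** 2)


if __name__ == "__main__":
    energy = ground_energy(M_E, 1e-9)
    print(first_zero(j0, 2.0, 4.0), energy / EV)  # alpha close to pi; energy in eV
```

The result agrees with $h^2/8ma^2$ and scales as $1/a^2$, so halving the radius
quadruples the zero point energy.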
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Recurrence Relations 


The recurrence relations to which we now turn provide a convenient way of
developing the higher order spherical Bessel functions. These recurrence
relations may be derived from the series, but as with the modified Bessel
functions, it is easier to substitute into the known recurrence relations
[Eqs. (12.8) and (12.21)]. This gives

$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x), \tag{12.133}$$

$$nf_{n-1}(x) - (n+1)f_{n+1}(x) = (2n+1)f'_n(x). \tag{12.134}$$

Rearranging these relations [or substituting into Eqs. (12.1) and (12.6)], we
obtain

$$\frac{d}{dx}\left[x^{n+1}f_n(x)\right] = x^{n+1}f_{n-1}(x), \tag{12.135}$$

$$\frac{d}{dx}\left[x^{-n}f_n(x)\right] = -x^{-n}f_{n+1}(x), \tag{12.136}$$

where $f_n$ may represent $j_n$, $y_n$, $h_n^{(1)}$, or $h_n^{(2)}$.

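Specific forms may also be readily obtained from Eqs. (12.133) and (12.134):

$$h_1^{(1)}(x) = e^{ix}\left(-\frac{1}{x} - \frac{i}{x^2}\right), \tag{12.137a}$$

$$h_2^{(1)}(x) = e^{ix}\left(\frac{i}{x} - \frac{3}{x^2} - \frac{3i}{x^3}\right), \tag{12.137b}$$

$$\begin{aligned}
j_1(x) &= \frac{\sin x}{x^2} - \frac{\cos x}{x}, \\
j_2(x) &= \left(\frac{3}{x^3} - \frac{1}{x}\right)\sin x - \frac{3}{x^2}\cos x, \\
y_1(x) &= -\frac{\cos x}{x^2} - \frac{\sin x}{x}, \\
y_2(x) &= -\left(\frac{3}{x^3} - \frac{1}{x}\right)\cos x - \frac{3}{x^2}\sin x,
\end{aligned} \tag{12.138}$$

and so on.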

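By mathematical induction one may establish the Rayleigh formulas

$$j_n(x) = (-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right), \tag{12.139}$$

$$y_n(x) = -(-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\cos x}{x}\right), \tag{12.140}$$

$$h_n^{(1)}(x) = -i(-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{e^{ix}}{x}\right), \tag{12.141}$$

$$h_n^{(2)}(x) = i(-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{e^{-ix}}{x}\right). \tag{12.142}$$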
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Numerical Computation 


The spherical Bessel functions are computed using the same techniques
described in Sections 12.1 and 12.3 for evaluating the Bessel functions. For
$j_n(x)$ it is convenient to use Eq. (12.133) and work downward, as is done for
$J_n(x)$. Normalization is accomplished by comparing with the known forms of
$j_0(x)$, Eqs. (12.121) and Exercise 12.4.11. For $y_n(x)$, Eq. (12.133) is used
again, but this time working upward, starting with the known forms of $y_0(x)$,
$y_1(x)$ [Eqs. (12.122) and (12.139)].
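
The two recurrence directions can be sketched as follows. This is a minimal
illustration under stated assumptions: both routines use the recurrence
$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}f_n(x)$ of Eq. (12.133); the downward
pass for $j_n$ is normalized with $j_0(x) = \sin x/x$, so it is unsuitable when
$x$ is close to a zero of $j_0$; the starting order, the seed value, and the
function names are illustrative choices of ours.

```python
import math


def spherical_y(n_max, x):
    """[y_0(x), ..., y_{n_max}(x)] by upward recurrence from y_0 and y_1."""
    y = [-math.cos(x) / x, -math.cos(x) / x ** 2 - math.sin(x) / x]
    for n in range(1, n_max):
        # y_{n+1} = (2n+1)/x * y_n - y_{n-1}
        y.append((2 * n + 1) / x * y[n] - y[n - 1])
    return y[: n_max + 1]


def spherical_j(n_max, x, extra=20):
    """[j_0(x), ..., j_{n_max}(x)] by downward recurrence, normalized by j_0."""
    start = max(n_max, int(x)) + extra
    vals = [0.0] * (start + 2)           # vals[start + 1] = 0 plays the role of j_{n+1}
    vals[start] = 1e-30                  # arbitrary small seed
    for n in range(start, 0, -1):
        # j_{n-1} = (2n+1)/x * j_n - j_{n+1}
        vals[n - 1] = (2 * n + 1) / x * vals[n] - vals[n + 1]
    scale = (math.sin(x) / x) / vals[0]  # fix the common factor with j_0 = sin(x)/x
    return [v * scale for v in vals[: n_max + 1]]


if __name__ == "__main__":
    x = 1.5
    print(spherical_j(2, x))
    print(spherical_y(2, x))
```

Comparing the $n = 1, 2$ entries with the closed forms for $j_1$, $j_2$, $y_1$,
and $y_2$ gives a quick consistency check of both passes.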


EXAMPLE 12.4.2 


Phase Shifts Here we show that spherical Bessel functions are needed to
define phase shifts for scattering from a spherically symmetric potential
$V(r)$. The radial wave function $R_l(r) = u(r)/r$ of the Schrödinger equation
satisfies the ODE

$$\left[-\frac{1}{r^2}\frac{d}{dr}\,r^2\frac{d}{dr} + \frac{l(l+1)}{r^2} + \frac{2m}{\hbar^2}V(r) - k^2\right]R_l(r) = 0, \tag{12.143}$$

where $m$ is the reduced mass, $E = \hbar^2k^2/2m$ the energy eigenvalue, and
the scattering potential $V$ goes to zero (exponentially) for $r \to \infty$.
At large values of $r$, where $V$ is negligible, this ODE is identical to
Eq. (12.111), with the general solution a linear combination of the regular and
irregular solutions; that is,

$$R_l(r) = A_lj_l(kr) + B_ly_l(kr), \qquad r \to \infty. \tag{12.144}$$

Using the asymptotic expansions Eqs. (12.126) and (12.127) we have

$$R_l(r) \sim A_l\frac{\sin(kr - l\pi/2)}{kr} - B_l\frac{\cos(kr - l\pi/2)}{kr}, \qquad r \to \infty. \tag{12.145}$$

If there is no scattering, that is, $V(r) = 0$, then the incident plane wave is
our solution and $B_l = 0$ because it has no $y_l$ contribution, being finite
everywhere. Therefore, $B_l/A_l = -\tan\delta_l(k)$ is a measure for the amount
of scattering at momentum $\hbar k$.

Because we can rewrite Eq. (12.145) as

$$R_l(r) \sim C_l\frac{\sin(kr - l\pi/2 + \delta_l)}{kr}, \qquad C_l = \frac{A_l}{\cos\delta_l}, \tag{12.146}$$

$\delta_l$ is called the phase shift; it depends on the incident energy.

We expand the scattering wave function $\psi$ in Legendre polynomials
$P_l(\cos\theta)$ with the scattering angle $\theta$ defined by the position
$\mathbf{r}$ of the detector (Fig. 12.13). Then we compare with the asymptotic
expression of the wave function

$$\psi(\mathbf{r}) \sim e^{ikz} + f(\theta)\frac{e^{ikr}}{r}$$

by substituting the Rayleigh expansion (Exercise 12.4.21) for the incident plane
wave and replacing the spherical Bessel functions in this expansion by their
asymptotic form. For further details, see Griffiths, Introduction to Quantum
Mechanics, Section 11.2. Prentice-Hall, New York (1994). As a result, one finds
the partial wave expansion of the scattering amplitude as the coefficient of the
outgoing radial wave,

$$f(\theta) = \frac{1}{k}\sum_{l=0}^{\infty}(2l+1)\,e^{i\delta_l}\sin\delta_l\,P_l(\cos\theta). \tag{12.147}$$

Upon integrating $|f(\theta)|^2$ over the scattering angle using the
orthogonality of Legendre polynomials we obtain the total scattering cross
section

$$\sigma = \int |f(\theta)|^2\,d\Omega = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\sin^2\delta_l. \tag{12.148}$$


Figure 12.13 Incident Plane Wave Is Scattered by a Potential V into an Outgoing
Radial Wave


SUMMARY


Bessel functions of integer order are defined by a power series expansion of
their generating function

$$e^{(x/2)(t-1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\,t^n.$$

Bessel and Neumann functions are the regular and irregular solutions of
Bessel's ODE, which arises in the separation of variables in prominent PDEs,
such as Laplace's equation with spherical or cylindrical symmetry and the
heat and Helmholtz equations. Many of the properties of Bessel functions are
consequences of the Sturm-Liouville theory of their differential equation.


EXERCISES 
12.4.1 Show that if

$$y_n(x) = \sqrt{\frac{\pi}{2x}}\,Y_{n+1/2}(x),$$

it automatically equals

$$(-1)^{n+1}\sqrt{\frac{\pi}{2x}}\,J_{-n-1/2}(x).$$

12.4.2 Derive the trigonometric-polynomial forms of $j_n(z)$ and $y_n(z)$: 16

$$j_n(z) = \frac{1}{z}\sin\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]} \frac{(-1)^s(n+2s)!}{(2s)!\,(2z)^{2s}\,(n-2s)!} + \frac{1}{z}\cos\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]} \frac{(-1)^s(n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}\,(n-2s-1)!},$$

$$y_n(z) = \frac{(-1)^{n+1}}{z}\cos\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]} \frac{(-1)^s(n+2s)!}{(2s)!\,(2z)^{2s}\,(n-2s)!} - \frac{(-1)^{n+1}}{z}\sin\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]} \frac{(-1)^s(n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}\,(n-2s-1)!}.$$


12.4.3 Use the integral representation of $J_\nu(x)$,

$$J_\nu(x) = \frac{1}{\pi^{1/2}(\nu-\frac{1}{2})!}\left(\frac{x}{2}\right)^\nu\int_{-1}^{1} e^{\pm ixp}(1-p^2)^{\nu-1/2}\,dp,$$

to show that the spherical Bessel functions $j_n(x)$ are expressible in
terms of trigonometric functions; that is, for example,

$$j_0(x) = \frac{\sin x}{x}, \qquad j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}.$$

12.4.4 (a) Derive the recurrence relations

$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x),$$

$$nf_{n-1}(x) - (n+1)f_{n+1}(x) = (2n+1)f'_n(x),$$

satisfied by the spherical Bessel functions $j_n(x)$, $y_n(x)$, $h_n^{(1)}(x)$,
and $h_n^{(2)}(x)$.

(b) Show from these two recurrence relations that the spherical Bessel
function $f_n(x)$ satisfies the differential equation

$$x^2f''_n(x) + 2xf'_n(x) + \left[x^2 - n(n+1)\right]f_n(x) = 0.$$

12.4.5 Prove by mathematical induction that

$$j_n(x) = (-1)^nx^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right)$$

for $n$ an arbitrary nonnegative integer.

12.4.6 From the discussion of orthogonality of the spherical Bessel functions,
show that a Wronskian relation for $j_n(x)$ and $y_n(x)$ is

$$j_n(x)y'_n(x) - j'_n(x)y_n(x) = \frac{1}{x^2}.$$

16 The upper limit on the summation $[n/2]$ means the largest integer that does not exceed $n/2$.


















12.4.7 Verify 


2 i 


h { n\x)h^' {pc) - h£ y (x)h%\x) = —,. 


X “ 


12.4.8 Verily Poisson’s integral representation of the spherical Bessel 
function, 


v n pn 


Jn(z) = 


2 n+1 n! 


dJ 


cos(zcos 9 ) sin 2m+1 6 d9. 


12.4.9 Show that 


f 


r [■ t r dx 2 sin[(> - v)jt/2] 

J/Axj'Mx)— = — -- 2 -> M + v > 0. 


x 7 r ^ — v* 


12.4.10 Derive

$$\int_{-\infty}^{\infty} j_m(x)\,j_n(x)\,dx = 0, \qquad m \neq n, \quad m, n \geq 0.$$

12.4.11 Derive

$$\int_{-\infty}^{\infty} [j_n(x)]^2\,dx = \frac{\pi}{2n+1}.$$

12.4.12 Set up the orthogonality integral for $j_l(kr)$ in a sphere of radius $R$ with
the boundary condition

$$j_l(kR) = 0.$$

The result is used in classifying electromagnetic radiation according
to its angular momentum $l$.

12.4.13 The Fresnel integrals (Fig. 12.14) occurring in diffraction theory are
given by

$$x(t) = \int_0^t \cos(v^2)\,dv, \qquad y(t) = \int_0^t \sin(v^2)\,dv.$$

Show that these integrals may be expanded in series of spherical
Bessel functions

$$x(s) = \frac12\int_0^s j_{-1}(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n}(s),$$

$$y(s) = \frac12\int_0^s j_0(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n+1}(s).$$

Hint. To establish the equality of the integral and the sum, you may 
wish to work with their derivatives. The spherical Bessel analogs of 
Eqs. (12.7) and (12.21) are helpful. 


Figure 12.14 Fresnel Integrals

12.4.14 A hollow sphere of radius $a$ (Helmholtz resonator) contains standing
sound waves. Find the minimum frequency of oscillation in terms of
the radius $a$ and the velocity of sound $v$. The sound waves satisfy the
wave equation

$$\nabla^2\psi = \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2}$$

and the boundary condition

$$\frac{\partial\psi}{\partial r} = 0, \qquad r = a.$$

This is a Neumann boundary condition. Example 12.4.1 has the same
PDE but with a Dirichlet boundary condition.

ANS. $\nu_{\min} = 0.3313\,v/a$, $\lambda_{\max} = 3.018\,a$.

12.4.15 A quantum particle is trapped in a spherically symmetric well of radius
$a$. The Schrödinger equation potential is

$$V(r) = \begin{cases} -V_0, & 0 < r < a \\ 0, & r > a. \end{cases}$$

The particle’s energy $E$ is negative (an eigenvalue).

(a) Show that the radial part of the wave function is given by $j_l(k_1 r)$
for $0 < r < a$ and $j_l^{\rm out}(k_2 r)$ for $r > a$. [We require that $\psi(0)$ and
$\psi(\infty)$ be finite.] Here, $k_1^2 = 2M(E + V_0)/\hbar^2$, $k_2^2 = -2ME/\hbar^2$, and $l$
is the angular momentum [$n$ in Eq. (12.111)].

(b) The boundary condition at $r = a$ is that the wave function $\psi(r)$
and its first derivative be continuous. Show that this means

$$\left.\frac{(d/dr)\,j_l(k_1 r)}{j_l(k_1 r)}\right|_{r=a} = \left.\frac{(d/dr)\,j_l^{\rm out}(k_2 r)}{j_l^{\rm out}(k_2 r)}\right|_{r=a}.$$

This equation determines the energy eigenvalues.

Note. This is a generalization of the deuteron Example 9.1.3.

where $k$ is the wave number, $k = \sqrt{2mE}/\hbar$, and $\delta_0$ is the scattering
phase shift. Show that the normalization integral is



Hint. You can use a sine representation of the Dirac delta function.

12.4.17 Derive the spherical Bessel function closure relation



Note. An interesting derivation involving Fourier transforms, the
Rayleigh plane wave expansion, and spherical harmonics has been
given by P. Ugincius, Am. J. Phys. 40, 1690 (1972).

12.4.18 The wave function of a particle in a sphere (Example 12.4.2) with
angular momentum $l$ is $\psi(r, \theta, \varphi) = A\,j_l((\sqrt{2ME})\,r/\hbar)\,Y_l^m(\theta, \varphi)$. The
$Y_l^m(\theta, \varphi)$ is a spherical harmonic, described in Section 11.5. From the
boundary condition $\psi(a, \theta, \varphi) = 0$ or $j_l((\sqrt{2ME})\,a/\hbar) = 0$, calculate
the 10 lowest energy states. Disregard the $m$ degeneracy ($2l + 1$ values
of $m$ for each choice of $l$). Check your results against Maple, Mathematica, etc.

Check values. $j_l(\alpha_{ls}) = 0$:

$\alpha_{01} = 3.1416$
$\alpha_{11} = 4.4934$
$\alpha_{21} = 5.7635$
$\alpha_{02} = 6.2832$.


12.4.19 Let Example 12.4.1 be modified so that the potential is a finite $V_0$
outside ($r > a$). Use symbolic software (Mathematica, Maple, etc.).

(a) For $E < V_0$ show that



(b) The new boundary conditions to be satisfied at $r = a$ are

$$\psi_{\rm in}(a, \theta, \varphi) = \psi_{\rm out}(a, \theta, \varphi),$$

$$\frac{\partial}{\partial r}\psi_{\rm in}(a, \theta, \varphi) = \frac{\partial}{\partial r}\psi_{\rm out}(a, \theta, \varphi)$$

or

$$\left.\frac{1}{\psi_{\rm in}}\frac{\partial\psi_{\rm in}}{\partial r}\right|_{r=a} = \left.\frac{1}{\psi_{\rm out}}\frac{\partial\psi_{\rm out}}{\partial r}\right|_{r=a}.$$

For $l = 0$ show that the boundary condition at $r = a$ leads to

$$f(E) = k\left[\cot ka - \frac{1}{ka}\right] + k'\left[1 + \frac{1}{k'a}\right] = 0,$$

where $k = \sqrt{2ME}/\hbar$ and $k' = \sqrt{2M(V_0 - E)}/\hbar$.

(c) With $a = \hbar^2/Me^2$ (Bohr radius) and $V_0 = 4Me^4/2\hbar^2$, compute the
possible bound states ($0 < E < V_0$).

Hint. Call a root-finding subroutine after you know the approximate
location of the roots of

$$f(E) = 0, \qquad (0 < E < V_0).$$

(d) Show that when $a = \hbar^2/Me^2$ the minimum value of $V_0$ for which
a bound state exists is $V_0 = 2.4674\,Me^4/2\hbar^2$.


12.4.20 In some nuclear stripping reactions the differential cross section is
proportional to $j_l(x)^2$, where $l$ is the angular momentum. The location
of the maximum on the curve of experimental data permits a determination
of $l$ if the location of the (first) maximum of $j_l(x)$ is known.
Compute the location of the first maximum of $j_1(x)$, $j_2(x)$, and $j_3(x)$.
Note. For better accuracy, look for the first zero of $j_l'(x)$. Why is this
more accurate than direct location of the maximum?


12.4.21 A plane wave may be expanded in a series of spherical waves by the
Rayleigh equation

$$e^{ikr\cos\gamma} = \sum_{n=0}^{\infty} a_n\,j_n(kr)\,P_n(\cos\gamma).$$

Show that $a_n = i^n(2n + 1)$.

Hint.

• Use the orthogonality of the $P_n$ to solve for $a_n j_n(kr)$.

• Differentiate $n$ times with respect to $kr$ and set $r = 0$ to eliminate
the $r$ dependence.

• Evaluate the remaining integral by Exercise 11.4.4.

12.4.22 Verify the Rayleigh expansion of Exercise 12.4.21 by starting with the
following steps:

• Differentiate with respect to $kr$ to establish

$$\sum_n a_n\,j_n'(kr)\,P_n(\cos\gamma) = i\sum_n a_n\,j_n(kr)\cos\gamma\,P_n(\cos\gamma).$$

• Use a recurrence relation to replace $\cos\gamma\,P_n(\cos\gamma)$ by a linear
combination of $P_{n-1}$ and $P_{n+1}$.

• Use a recurrence relation to replace $j_n'$ by a linear combination of
$j_{n-1}$ and $j_{n+1}$.

12.4.23 The Legendre polynomials and the spherical Bessel functions are
related by

$$j_n(z) = \frac12(-i)^n\int_0^{\pi} e^{iz\cos\theta}\,P_n(\cos\theta)\,\sin\theta\,d\theta, \qquad n = 0, 1, 2, \ldots.$$

Verify this relation by transforming the right-hand side into

$$\frac{z^n}{2^{n+1}\,n!}\int_0^{\pi}\cos(z\cos\theta)\,\sin^{2n+1}\theta\,d\theta$$

using Exercise 12.4.21.


Additional Reading 


McBride, E. B. (1971). Obtaining Generating Functions. Springer-Verlag, New 
York. An introduction to methods of obtaining generating functions. 

Watson, G. N. (1922). A Treatise on the Theory of Bessel Functions. Cambridge 
Univ. Press, Cambridge, UK. 

Watson, G. N. (1952). A Treatise on the Theory of Bessel Functions, 2nd ed. 
Cambridge Univ. Press, Cambridge, UK. This is the definitive text on Bessel 
functions and their properties. Although difficult reading, it is invaluable 
as the ultimate reference. 
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Chapter 13 



Hermite and Laguerre 
Polynomials 


In this chapter we study two sets of orthogonal polynomials, Hermite and 
Laguerre polynomials. These sets are less common in mathematical physics 
than the Legendre and Bessel functions of Chapters 11 and 12, but Hermite 
polynomials occur in solutions of the simple harmonic oscillator of quantum 
mechanics and Laguerre polynomials in wave functions of the hydrogen atom. 

Because the general mathematical techniques are similar to those of the 
preceding two chapters, the development of these functions is only outlined. 
Some detailed proofs, along the lines of Chapters 11 and 12, are left to the 
reader. We start with Hermite polynomials. 




13.1 Hermite Polynomials 


Quantum Mechanical Simple Harmonic Oscillator 


For the physicist, Hermite polynomials are synonymous with the one-dimensional
(i.e., simple) harmonic oscillator of quantum mechanics. For a
potential energy

$$V = \frac12 Kz^2 = \frac12 m\omega^2 z^2, \qquad \text{force } F_z = -\partial V/\partial z = -Kz,$$

the Schrödinger equation of the quantum mechanical system is

$$-\frac{\hbar^2}{2m}\frac{d^2}{dz^2}\Psi(z) + \frac12 Kz^2\,\Psi(z) = E\,\Psi(z). \tag{13.1}$$


Our oscillating particle has mass $m$ and total energy $E$. From quantum mechanics,
we recall that for bound states the boundary conditions

$$\lim_{z\to\pm\infty}\Psi(z) = 0 \tag{13.2}$$

restrict the energy eigenvalue $E$ to a discrete set $E_n = \lambda_n\hbar\omega$, where $\omega$ is the
angular frequency of the corresponding classical oscillator. The dimensionless
eigenvalue $\lambda_n$ is introduced by rescaling the coordinate $z$ in favor of the
dimensionless variable $x$ and transforming the parameters as follows:

$$x = \alpha z \quad\text{with}\quad \alpha^4 \equiv \frac{mK}{\hbar^2} \equiv \frac{m^2\omega^2}{\hbar^2}, \qquad 2\lambda_n \equiv \frac{2E_n}{\hbar}\left(\frac{m}{K}\right)^{1/2} = \frac{2E_n}{\hbar\omega}. \tag{13.3}$$

Eq. (13.1) becomes [with $\Psi(z) = \Psi(x/\alpha) = \psi(x)$] the ordinary differential
equation (ODE)

$$\frac{d^2\psi_n(x)}{dx^2} + (2\lambda_n - x^2)\,\psi_n(x) = 0. \tag{13.4}$$

If we substitute $\psi_n(x) = e^{-x^2/2}H_n(x)$ into Eq. (13.4), we obtain the ODE

$$H_n'' - 2xH_n' + (2\lambda_n - 1)H_n = 0 \tag{13.5}$$

of Exercise 8.5.6. A power series solution of Eq. (13.5) shows that $H_n(x)$ will
behave as $e^{x^2}$ for large $x$, unless $\lambda_n = n + 1/2$, $n = 0, 1, 2, \ldots$. Thus, $\psi_n(x)$ and
$\Psi_n(z)$ will blow up at infinity, and it will be impossible for the wave function
$\Psi(z)$ to satisfy the boundary conditions [Eq. (13.2)] unless

$$E_n = \lambda_n\hbar\omega = \left(n + \frac12\right)\hbar\omega, \qquad n = 0, 1, 2, \ldots \tag{13.6}$$

This is the key property of the harmonic oscillator spectrum. We see that the
energy is quantized and that there is a minimum or zero point energy

$$E_{\min} = E_0 = \frac12\hbar\omega.$$

This zero point energy is an aspect of the uncertainty principle, a genuine
quantum phenomenon. Also, with $2\lambda_n - 1 = 2n$, Eq. (13.5) becomes Hermite’s
ODE and $H_n(x)$ are the Hermite polynomials. The solutions $\psi_n$ (Fig. 13.1) of
Eq. (13.4) are proportional to the Hermite polynomials (footnote 1) $H_n(x)$.

This is the differential equations approach, a standard quantum mechanical
treatment. However, we shall prove these statements next employing the
method of ladder operators.


Raising and Lowering Operators 


The following development is analogous to the use of the raising and lowering
operators for angular momentum operators presented in Section 4.3. The key
aspect of Eq. (13.4) is that its Hamiltonian

$$-2\mathcal{H} \equiv \frac{d^2}{dx^2} - x^2 = \left(\frac{d}{dx} - x\right)\left(\frac{d}{dx} + x\right) + \left[x, \frac{d}{dx}\right] \tag{13.7}$$


1 Note the absence of a superscript, which distinguishes Hermite polynomials from the unrelated
Hankel functions in Chapter 12.
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Figure 13.1 

Quantum Mechanical Oscillator Wave Functions. The Heavy Bar on the
x-Axis Indicates the Allowed Range of the Classical Oscillator with the
Same Total Energy



almost factorizes. Using naively $a^2 - b^2 = (a - b)(a + b)$, the basic commutator
$[p_x, x] = \hbar/i$ of quantum mechanics [with momentum $p_x = (\hbar/i)\,d/dx$] enters
as a correction in Eq. (13.7). [Because $p_x$ is Hermitian, $d/dx$ is anti-Hermitian,
$(d/dx)^\dagger = -d/dx$.] This commutator can be evaluated as follows. Imagine
the differential operator $d/dx$ acts on a wave function $\psi(x)$ to the right, as in
Eq. (13.4), so that

$$\frac{d}{dx}(x\psi) = x\frac{d}{dx}\psi + \psi \tag{13.8}$$

by the product rule. Dropping the wave function $\psi$ from Eq. (13.8), we rewrite
Eq. (13.8) as

$$\frac{d}{dx}x - x\frac{d}{dx} \equiv \left[\frac{d}{dx}, x\right] = 1, \tag{13.9}$$

a constant, and then verify Eq. (13.7) directly by expanding the product. The
product form of Eq. (13.7), up to the constant commutator, suggests introducing
the non-Hermitian operators

$$a^\dagger \equiv \frac{1}{\sqrt2}\left(x - \frac{d}{dx}\right), \qquad a \equiv \frac{1}{\sqrt2}\left(x + \frac{d}{dx}\right), \tag{13.10}$$

with $(a)^\dagger = a^\dagger$. They obey the commutation relations

$$[a, a^\dagger] = \left[\frac{d}{dx}, x\right] = 1, \qquad [a, a] = 0 = [a^\dagger, a^\dagger], \tag{13.11}$$
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which are characteristic of these operators and straightforward to derive 
from Eq. (13.9) and 

$[d/dx, d/dx] = 0 = [x, x]$ and $[x, d/dx] = -[d/dx, x]$.
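
The commutators of Eqs. (13.9) and (13.11) can be checked mechanically by letting the operators act on polynomials. The following is only an illustrative sketch, not part of the text's derivation: polynomials are stored as coefficient lists, the helper names (`deriv`, `times_x`, `lower`, `raise_`) are hypothetical, and the factor $1/\sqrt2$ in $a$ and $a^\dagger$ is pulled out so that exact rational arithmetic can be used.

```python
from fractions import Fraction

# A polynomial is a coefficient list: p[k] multiplies x**k.

def deriv(p):
    """d/dx acting on a polynomial."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

def times_x(p):
    """Multiplication by x."""
    return [Fraction(0)] + list(p)

def add(p, q, sign=1):
    """Return p + sign*q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + sign * b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def lower(p):
    """sqrt(2) * a = x + d/dx."""
    return add(times_x(p), deriv(p))

def raise_(p):
    """sqrt(2) * a^dagger = x - d/dx."""
    return add(times_x(p), deriv(p), sign=-1)

def commutator_dx_x(p):
    """[d/dx, x] p = d/dx (x p) - x (dp/dx)."""
    return trim(add(deriv(times_x(p)), times_x(deriv(p)), sign=-1))

def commutator_a_adagger(p):
    """[a, a^dagger] p; the two factors 1/sqrt(2) combine to 1/2."""
    diff = add(lower(raise_(p)), raise_(lower(p)), sign=-1)
    return trim([c / 2 for c in diff])

p = [Fraction(3), Fraction(-1), Fraction(0), Fraction(5)]  # 3 - x + 5x^3
print(commutator_dx_x(p) == p)        # Eq. (13.9): [d/dx, x] acts as 1
print(commutator_a_adagger(p) == p)   # Eq. (13.11): [a, a^dagger] acts as 1
```

Both commutators return the input polynomial unchanged, which is what "equal to 1" means for an operator identity.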

Returning to Eq. (13.7) and using Eq. (13.10), we rewrite the Hamiltonian as

$$\mathcal{H} = a^\dagger a + \frac12 = a^\dagger a + \frac12[a, a^\dagger] = \frac12(a^\dagger a + a a^\dagger) \tag{13.12}$$

and introduce the Hermitian number operator $N = a^\dagger a$ so that $\mathcal{H} = N + 1/2$.
We also use the simpler notation $\psi_n = |n\rangle$ so that Eq. (13.4) becomes

$$\mathcal{H}|n\rangle = \lambda_n|n\rangle.$$

Now we prove the key property that $N$ has nonnegative integer eigenvalues

$$N|n\rangle = n|n\rangle, \qquad n = 0, 1, 2, \ldots, \tag{13.13}$$

that is, $\lambda_n = n + 1/2$. From

$$|a|n\rangle|^2 = \langle n|a^\dagger a|n\rangle = \lambda_n - \frac12 \geq 0, \tag{13.14}$$

we see that $N$ has nonnegative eigenvalues.

We now show that the commutation relations

$$[N, a^\dagger] = a^\dagger, \qquad [N, a] = -a, \tag{13.15}$$

which follow from Eq. (13.11), characterize $N$ as the number operator that
counts the oscillator quanta in the eigenstate $|n\rangle$. To this end, we determine
the eigenvalue of $N$ for the states $a^\dagger|n\rangle$ and $a|n\rangle$. Using $a a^\dagger = N + 1$, we see
that

$$N(a^\dagger|n\rangle) = a^\dagger(N + 1)|n\rangle = \left(\lambda_n + \frac12\right)a^\dagger|n\rangle = (n + 1)\,a^\dagger|n\rangle,$$

$$N(a|n\rangle) = (a a^\dagger - 1)\,a|n\rangle = a(N - 1)|n\rangle = \left(\lambda_n - \frac32\right)a|n\rangle = (n - 1)\,a|n\rangle. \tag{13.16}$$

In other words, $N$ acting on $a^\dagger|n\rangle$ shows that $a^\dagger$ has raised the eigenvalue $n$
of $|n\rangle$ by one unit; hence its name raising or creation operator. Applying
$a^\dagger$ repeatedly, we can reach all higher excitations. There is no upper limit to
the sequence of eigenvalues. Similarly, $a$ lowers the eigenvalue $n$ by one unit;
hence, it is a lowering or annihilation operator. Therefore,

$$a^\dagger|n\rangle \sim |n + 1\rangle, \qquad a|n\rangle \sim |n - 1\rangle. \tag{13.17}$$


Applying $a$ repeatedly, we can reach the lowest or ground state $|0\rangle$ with eigenvalue
$\lambda_0$. We cannot step lower because $\lambda_0 \geq 1/2$. Therefore, $a|0\rangle = 0$,
suggesting we construct $\psi_0 = |0\rangle$ from the (factored) first-order ODE

$$\sqrt2\,a\,\psi_0 = \left(\frac{d}{dx} + x\right)\psi_0 = 0. \tag{13.18}$$

Integrating

$$\frac{\psi_0'}{\psi_0} = -x, \tag{13.19}$$

we obtain

$$\ln\psi_0 = -\frac12 x^2 + \ln c_0, \tag{13.20}$$

where $c_0$ is an integration constant. The solution

$$\psi_0(x) = c_0\,e^{-x^2/2} \tag{13.21}$$

can be normalized, with $c_0 = \pi^{-1/4}$ using the error integral. Substituting $\psi_0$
into Eq. (13.4) we find

$$\mathcal{H}|0\rangle = \frac12|0\rangle \tag{13.22}$$

so that its energy eigenvalue is $\lambda_0 = 1/2$ and its number eigenvalue is $n = 0$,
confirming the notation $|0\rangle$. Applying $a^\dagger$ repeatedly to $\psi_0 = |0\rangle$, all other
eigenvalues are confirmed to be $\lambda_n = n + 1/2$, proving Eq. (13.13). The
normalizations in Eq. (13.17) follow from Eqs. (13.14), (13.16), and

$$|a^\dagger|n\rangle|^2 = \langle n|a a^\dagger|n\rangle = \langle n|a^\dagger a + 1|n\rangle = n + 1, \tag{13.23}$$

showing

$$\sqrt{n+1}\,|n + 1\rangle = a^\dagger|n\rangle, \qquad \sqrt{n}\,|n - 1\rangle = a|n\rangle. \tag{13.24}$$

Thus, the excited-state wave functions, $\psi_1$, $\psi_2$, and so on, are generated by the
raising operator

$$|1\rangle = a^\dagger|0\rangle = \frac{1}{\sqrt2}\left(x - \frac{d}{dx}\right)\psi_0(x) = \frac{x\sqrt2}{\pi^{1/4}}\,e^{-x^2/2}, \tag{13.25}$$

yielding [and leading to Eq. (13.5)]

$$\psi_n(x) = N_n H_n(x)\,e^{-x^2/2}, \qquad N_n = \pi^{-1/4}(2^n n!)^{-1/2}, \tag{13.26}$$

where $H_n$ are the Hermite polynomials (Fig. 13.2).
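
The ladder construction can be followed step by step on a computer. The sketch below is an illustration under stated assumptions rather than the text's own algorithm: it uses the fact that $(x - d/dx)$ acting on $p(x)e^{-x^2/2}$ gives $(2xp - p')e^{-x^2/2}$, so repeated application of $\sqrt2\,a^\dagger$ to the ground state only requires updating the polynomial factor. The function names are hypothetical.

```python
import math

def raise_gaussian_factor(p):
    """Apply (x - d/dx) to p(x)*exp(-x^2/2) and return the new polynomial factor.

    (x - d/dx)[p e^{-x^2/2}] = (2x p - p') e^{-x^2/2}; p[k] multiplies x**k.
    """
    out = [0] * (len(p) + 1)
    for k, c in enumerate(p):
        out[k + 1] += 2 * c          # contribution of 2x * p
        if k >= 1:
            out[k - 1] -= k * c      # contribution of -p'
    return out

def ladder_polynomials(n_max):
    """Polynomial factors of (x - d/dx)^n exp(-x^2/2) for n = 0..n_max."""
    polys = [[1]]
    for _ in range(n_max):
        polys.append(raise_gaussian_factor(polys[-1]))
    return polys

def psi(n, x):
    """Normalized wave function of Eq. (13.26), built from the ladder result."""
    h = ladder_polynomials(n)[n]
    norm = math.pi ** -0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    return norm * sum(c * x ** k for k, c in enumerate(h)) * math.exp(-x * x / 2)

for n, h in enumerate(ladder_polynomials(3)):
    print(n, h)          # coefficient lists, lowest power first
print(psi(1, 1.0))       # compare with Eq. (13.25) evaluated at x = 1
```

The printed coefficient lists can be compared with the Hermite polynomials derived from the recurrence relations in the next subsection.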


Biographical Data 

Hermite, Charles. Hermite, a French mathematician, was born in Dieuze
in 1822 and died in Paris in 1902. His most famous result is the first proof that 
e is a transcendental number; that is, e is not the root of any polynomial with 
integer coefficients. He also contributed to elliptic and modular functions. 
Having been recognized slowly, he became a professor at the Sorbonne in 
1870. 
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Figure 13.2 
Hermite Polynomials 



Recurrence Relations and Generating Function 


Now we can establish the recurrence relations

$$2xH_n(x) - H_n'(x) = H_{n+1}(x), \qquad H_n'(x) = 2nH_{n-1}(x), \tag{13.27}$$

from which

$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x) \tag{13.28}$$

follows by adding them. To prove Eq. (13.27), we apply

$$x - \frac{d}{dx} = -e^{x^2/2}\frac{d}{dx}e^{-x^2/2} \tag{13.29}$$

to $\psi_n(x)$ of Eq. (13.26) and recall Eq. (13.24) to find

$$N_{n+1}H_{n+1}\,e^{-x^2/2} = -\frac{N_n}{\sqrt{2(n+1)}}\,e^{-x^2/2}\,(-2xH_n + H_n'), \tag{13.30}$$

that is, the first part of Eq. (13.27). Using $x + d/dx$ instead, we get the second
half of Eq. (13.27).


EXAMPLE 13.1.1 


The First Few Hermite Polynomials We expect the first Hermite polynomial
to be a constant, $H_0(x) = 1$, being normalized. Then $n = 0$ in the recursion
relation [Eq. (13.28)] yields $H_1 = 2xH_0 = 2x$; $n = 1$ implies $H_2 = 2xH_1 - 2H_0 = 4x^2 - 2$,
and $n = 2$ implies

$$H_3(x) = 2xH_2(x) - 4H_1(x) = 2x(4x^2 - 2) - 8x = 8x^3 - 12x.$$

Comparing with Eq. (13.27) for $n = 0, 1, 2$, we verify that our results are
consistent: $2xH_0 - H_0' = H_1 = 2x\cdot 1 - 0$, etc. For convenient reference, the
first several Hermite polynomials are listed in Table 13.1. ■
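
The hand computation in this example extends directly to higher orders. The following minimal sketch, assuming only the three-term recursion of Eq. (13.28) and the starting values $H_0 = 1$, $H_1 = 2x$, builds the coefficient lists and checks them against the second relation in Eq. (13.27). The function names are illustrative choices, not notation from the text.

```python
def hermite_coefficients(n_max):
    """Coefficient lists of H_0..H_{n_max} from H_{n+1} = 2x H_n - 2n H_{n-1}.

    h[n][k] is the coefficient of x**k in H_n(x).
    """
    h = [[1]]
    if n_max >= 1:
        h.append([0, 2])
    for n in range(1, n_max):
        nxt = [0] * (n + 2)
        for k, c in enumerate(h[n]):
            nxt[k + 1] += 2 * c              # 2x * H_n
        for k, c in enumerate(h[n - 1]):
            nxt[k] -= 2 * n * c              # -2n * H_{n-1}
        h.append(nxt)
    return h

def derivative(p):
    """Coefficient list of dp/dx."""
    return [k * p[k] for k in range(1, len(p))] or [0]

table = hermite_coefficients(4)
for n, coeffs in enumerate(table):
    print(f"H_{n}:", coeffs)                 # lowest power first

# Consistency with Eq. (13.27): H_n' = 2n H_{n-1}.
for n in range(1, 5):
    assert derivative(table[n]) == [2 * n * c for c in table[n - 1]]
```

For $n = 3$ the list reads `[0, -12, 0, 8]`, that is, $8x^3 - 12x$, in agreement with the result above.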
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Table 13.1 
Hermite Polynomials 


EXAMPLE 13.1.2 


EXAMPLE 13.1.3 



The Hermite polynomials $H_n(x)$ may be summed to yield the generating
function

$$g(x, t) = e^{-t^2 + 2tx} = \sum_{n=0}^{\infty} H_n(x)\frac{t^n}{n!}, \tag{13.31}$$

which we derive next from the recursion relation [Eq. (13.28)]:

$$0 \equiv \sum_{n=1}^{\infty}\frac{t^n}{n!}\left(H_{n+1} - 2xH_n + 2nH_{n-1}\right) = \frac{\partial g}{\partial t} - 2xg + 2tg. \tag{13.32}$$

Integrating this ODE in $t$ by separating the variables $t$ and $g$, we get

$$\frac{1}{g}\frac{\partial g}{\partial t} = 2(x - t), \tag{13.33}$$

which yields

$$\ln g = 2xt - t^2 + \ln c, \qquad g(x, t) = e^{-t^2 + 2xt}\,c(x), \tag{13.34}$$

where $c$ is an integration constant that may depend on the parameter $x$. Direct
expansion of the exponential in Eq. (13.34) gives $H_0(x) = 1$ and $H_1(x) = 2x$
along with $c(x) \equiv 1$.
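
As a numerical sanity check of Eq. (13.31), one can compare a partial sum of the series with the closed form. The sketch below assumes the recursion of Eq. (13.28) for the values $H_n(x)$ and truncates the sum at a hypothetical cutoff `n_max`; the sample point $(x, t) = (0.7, 0.3)$ is an arbitrary choice.

```python
import math

def hermite_values(n_max, x):
    """H_0(x)..H_{n_max}(x) by the recurrence of Eq. (13.28)."""
    vals = [1.0, 2.0 * x]
    for n in range(1, n_max):
        vals.append(2.0 * x * vals[n] - 2.0 * n * vals[n - 1])
    return vals[: n_max + 1]

def generating_sum(x, t, n_max=60):
    """Partial sum of sum_n H_n(x) t^n / n! for n = 0..n_max."""
    total = 0.0
    term_factor = 1.0                  # holds t^n / n!
    for n, h in enumerate(hermite_values(n_max, x)):
        total += h * term_factor
        term_factor *= t / (n + 1)
    return total

x, t = 0.7, 0.3
print(generating_sum(x, t))            # truncated series
print(math.exp(-t * t + 2 * t * x))    # closed form, Eq. (13.31)
```

For moderate $|t|$ the two numbers agree to many digits; the truncation error grows if $|t|$ is taken large.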


Special Values Special values of the Hermite polynomials follow from the
generating function for $x = 0$; that is,

$$g(x = 0, t) = e^{-t^2} = \sum_{n=0}^{\infty} H_n(0)\frac{t^n}{n!} = \sum_{n=0}^{\infty}(-1)^n\frac{t^{2n}}{n!}.$$

A comparison of coefficients of these power series yields

$$H_{2n}(0) = (-1)^n\frac{(2n)!}{n!}, \qquad H_{2n+1}(0) = 0. \tag{13.35}$$

Parity Similarly, we obtain from the generating function identity

$$g(-x, t) = e^{-t^2 - 2tx} = g(x, -t)$$

the power series identity

$$\sum_{n=0}^{\infty} H_n(-x)\frac{t^n}{n!} = \sum_{n=0}^{\infty} H_n(x)\frac{(-t)^n}{n!}, \tag{13.36}$$

which implies the important parity relation

$$H_n(x) = (-1)^n H_n(-x). \tag{13.37}$$


In quantum mechanical problems, particularly in molecular spectroscopy, a
number of integrals of the form

$$\int_{-\infty}^{\infty} x^r e^{-x^2} H_n(x) H_m(x)\,dx$$

are needed. Examples for $r = 1$ and $r = 2$ (with $n = m$) are included in the
exercises at the end of this section. Many other examples are contained in Wilson
et al. (footnote 2). The oscillator potential has also been employed extensively in
calculations of nuclear structure (nuclear shell model) and quark models of hadrons.

There is a second independent solution to Eq. (13.4). This Hermite function 
is an infinite series (Sections 8.5 and 8.6) and of no physical interest yet. 


Alternate Representations 


Differentiation of the generating function (footnote 3) $n$ times with respect to $t$ and then
setting $t$ equal to zero yields

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}. \tag{13.38}$$

This gives us a Rodrigues representation of $H_n(x)$. A second representation
may be obtained by using the calculus of residues (Chapter 7). If we multiply
Eq. (13.31) by $t^{-m-1}$ and integrate around the origin, only the term with $H_m(x)$
will survive:

$$H_m(x) = \frac{m!}{2\pi i}\oint t^{-m-1}e^{-t^2 + 2tx}\,dt. \tag{13.39}$$

Also, from Eq. (13.31) we may write our Hermite polynomial $H_n(x)$ in series
form:

$$H_n(x) = (2x)^n - \frac{2\,n!}{(n-2)!\,2!}(2x)^{n-2} + \frac{4\,n!}{(n-4)!\,4!}(2x)^{n-4}\,1\cdot 3 - \cdots$$

$$= \sum_{s=0}^{[n/2]}(-2)^s(2x)^{n-2s}\binom{n}{2s}\,1\cdot 3\cdot 5\cdots(2s-1)$$

$$= \sum_{s=0}^{[n/2]}(-1)^s(2x)^{n-2s}\frac{n!}{(n-2s)!\,s!}. \tag{13.40}$$

This series terminates for integral $n$ and yields our Hermite polynomial.


2 Wilson, E. B., Jr., Decius, J. C., and Cross, P. C. (1955). Molecular Vibrations. McGraw-Hill, 
New York. Reprinted, Dover, New York (1980). 

3 Rewrite the generating function as $g(x, t) = e^{x^2}e^{-(t-x)^2}$. Note that

$$\frac{\partial}{\partial t}e^{-(t-x)^2} = -\frac{\partial}{\partial x}e^{-(t-x)^2}.$$
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EXAMPLE 13.1.4 


The Lowest Hermite Polynomials For $n = 0$, we find from the series,
Eq. (13.40), $H_0(x) = (-1)^0(2x)^0\frac{0!}{0!\,0!} = 1$; for $n = 1$, $H_1(x) = (-1)^0(2x)^1\frac{1!}{1!\,0!} = 2x$;
and for $n = 2$, the $s = 0$ and $s = 1$ terms in Eq. (13.40) give

$$H_2(x) = (-1)^0(2x)^2\frac{2!}{2!\,0!} - (2x)^0\frac{2!}{0!\,1!} = 4x^2 - 2,$$

etc. ■
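
The closed series of Eq. (13.40) also lends itself to a direct computation of the coefficients for any order. The short sketch below is one possible implementation; the function name is an assumption made for illustration, and integer arithmetic is used because each term $n!/[(n-2s)!\,s!]$ is an integer.

```python
from math import factorial

def hermite_series_coefficients(n):
    """Coefficients of H_n(x) from Eq. (13.40); result[k] multiplies x**k."""
    coeffs = [0] * (n + 1)
    for s in range(n // 2 + 1):
        power = n - 2 * s
        # (-1)^s * 2^(n-2s) * n! / ((n-2s)! s!), an exact integer
        coeffs[power] = ((-1) ** s * 2 ** power * factorial(n)
                         // (factorial(power) * factorial(s)))
    return coeffs

for n in range(5):
    print(n, hermite_series_coefficients(n))   # lowest power first
```

For $n = 2$ this gives `[-2, 0, 4]`, matching $H_2(x) = 4x^2 - 2$ from the example.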


Orthogonality

The recurrence relations [Eqs. (13.27) and (13.28)] lead to the second-order
ODE

$$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0, \tag{13.41}$$

which is clearly not self-adjoint.

To put Eq. (13.41) in self-adjoint form, we multiply by $\exp(-x^2)$ (Exercise
9.1.2). This leads to the orthogonality integral

$$\int_{-\infty}^{\infty} H_m(x)H_n(x)\,e^{-x^2}\,dx = 0, \qquad m \neq n, \tag{13.42}$$

with the weighting function $\exp(-x^2)$ a consequence of putting the differential
equation into self-adjoint form. The interval $(-\infty, \infty)$ is dictated by the
boundary conditions of the harmonic oscillator, which are consistent with the
Hermitian operator boundary conditions (Section 9.1). It is sometimes convenient
to absorb the weighting function into the Hermite polynomials. We may
define

$$\varphi_n(x) = e^{-x^2/2}H_n(x), \tag{13.43}$$

with $\varphi_n(x)$ no longer a polynomial.

Substituting $H_n = e^{x^2/2}\varphi_n$ into Eq. (13.41) yields the quantum mechanical
harmonic oscillator ODE [Eq. (13.4)] for $\varphi_n(x)$:

$$\varphi_n''(x) + (2n + 1 - x^2)\,\varphi_n(x) = 0, \tag{13.44}$$

which is self-adjoint. Thus, its solutions $\varphi_n(x)$ are orthogonal for the interval
$(-\infty < x < \infty)$ with a unit weighting function. The problem of normalizing
these functions remains. Proceeding as in Section 11.3, we multiply Eq. (13.31)
by itself and then by $e^{-x^2}$. This yields

$$e^{-x^2}e^{-s^2 + 2sx}e^{-t^2 + 2tx} = \sum_{m,n=0}^{\infty} e^{-x^2}H_m(x)H_n(x)\frac{s^m t^n}{m!\,n!}. \tag{13.45}$$
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SUMMARY 


When we integrate over $x$ from $-\infty$ to $+\infty$ the cross terms of the double sum
drop out because of the orthogonality property (footnote 4)

$$\sum_{n=0}^{\infty}\frac{(st)^n}{n!\,n!}\int_{-\infty}^{\infty} e^{-x^2}[H_n(x)]^2\,dx = \int_{-\infty}^{\infty} e^{-x^2 - s^2 + 2sx - t^2 + 2tx}\,dx$$

$$= \int_{-\infty}^{\infty} e^{-(x - s - t)^2}e^{2st}\,dx = \pi^{1/2}e^{2st} = \pi^{1/2}\sum_{n=0}^{\infty}\frac{2^n(st)^n}{n!}. \tag{13.46}$$

By equating coefficients of like powers of $st$, we obtain

$$\int_{-\infty}^{\infty} e^{-x^2}[H_n(x)]^2\,dx = 2^n\pi^{1/2}n!, \tag{13.47}$$

which yields the normalization $N_n$ in Eq. (13.26).

Hermite polynomials are solutions of the simple harmonic oscillator of quantum
mechanics. Their properties directly follow from writing their ODE as a
product of creation and annihilation operators and the Sturm-Liouville theory
of their ODE.


EXERCISES 


13.1.1 In developing the properties of the Hermite polynomials, start at a
number of different points, such as

(1) Hermite’s ODE, Eqs. (13.5) and (13.44);

(2) Rodrigues’s formula, Eq. (13.38);

(3) Integral representation, Eq. (13.39);

(4) Generating function, Eq. (13.31); and

(5) Gram-Schmidt construction of a complete set of orthogonal polynomials
over $(-\infty, \infty)$ with a weighting factor of $\exp(-x^2)$,
Section 9.3.

Outline how you can go from any one of these starting points to all
the other points.


13.1.2 From the generating function, show that

$$H_n(x) = \sum_{s=0}^{[n/2]}(-1)^s\frac{n!}{(n-2s)!\,s!}(2x)^{n-2s}.$$

13.1.3 From the generating function, derive the recurrence relations

$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x),$$

$$H_n'(x) = 2nH_{n-1}(x).$$


4 The cross terms ($m \neq n$) may be left in, if desired. Then, when the coefficients of $s^\alpha t^\beta$ are equated,
the orthogonality will be apparent.
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13.1.4 Prove that 

$$\left(2x - \frac{d}{dx}\right)^{n} 1 = H_n(x).$$

Hint. Check out the first couple of examples and then use mathematical
induction.

13.1.5 Prove that

$$|H_n(x)| \leq |H_n(ix)|.$$

13.1.6 Rewrite the series form of $H_n(x)$ [Eq. (13.40)] as an ascending power
series.

ANS. $$H_{2n}(x) = (-1)^n\sum_{s=0}^{n}(-1)^{s}(2x)^{2s}\frac{(2n)!}{(2s)!\,(n-s)!},$$

$$H_{2n+1}(x) = (-1)^n\sum_{s=0}^{n}(-1)^{s}(2x)^{2s+1}\frac{(2n+1)!}{(2s+1)!\,(n-s)!}.$$


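13.1.7 (a) Expand $x^{2r}$ in a series of even-order Hermite polynomials.
(b) Expand $x^{2r+1}$ in a series of odd-order Hermite polynomials.

ANS. (a) $$x^{2r} = \frac{(2r)!}{2^{2r}}\sum_{n=0}^{r}\frac{H_{2n}(x)}{(2n)!\,(r-n)!},$$

(b) $$x^{2r+1} = \frac{(2r+1)!}{2^{2r+1}}\sum_{n=0}^{r}\frac{H_{2n+1}(x)}{(2n+1)!\,(r-n)!}, \qquad r = 0, 1, 2, \ldots.$$

Hint. Use a Rodrigues representation of $H_{2n}(x)$ and integrate by parts.

13.1.8 Show that

(a) $$\int_{-\infty}^{\infty} H_n(x)\exp[-x^2/2]\,dx = \begin{cases}\sqrt{2\pi}\,n!/(n/2)!, & n \text{ even} \\ 0, & n \text{ odd.}\end{cases}$$

(b) $$\int_{-\infty}^{\infty} xH_n(x)\exp[-x^2/2]\,dx = \begin{cases}0, & n \text{ even} \\ \sqrt{2\pi}\,\dfrac{(n+1)!}{((n+1)/2)!}, & n \text{ odd.}\end{cases}$$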


13.1.9 Show that

$$\int_{-\infty}^{\infty} x^m e^{-x^2}H_n(x)\,dx = 0 \quad\text{for } m \text{ an integer},\ 0 \leq m \leq n - 1.$$

13.1.10 The transition probability between two oscillator states, $m$ and $n$,
depends on

$$\int_{-\infty}^{\infty} x\,e^{-x^2}H_n(x)H_m(x)\,dx.$$

Show that this integral equals $\pi^{1/2}2^{n-1}n!\,\delta_{m,n-1} + \pi^{1/2}2^n(n+1)!\,\delta_{m,n+1}$.
This result shows that such transitions can occur only between states
of adjacent energy levels, $m = n \pm 1$.

Hint. Multiply the generating function [Eq. (13.31)] by itself using two
different sets of variables $(x, s)$ and $(x, t)$. Alternatively, the factor $x$
may be eliminated by a recurrence relation in Eq. (13.27).


13.1.11 Show that 



—OO 


This integral occurs in the calculation of the mean square displacement
of our quantum oscillator.

Hint. Use a recurrence relation Eq. (13.27) and the orthogonality 
integral. 

13.1.12 Evaluate 



in terms of n and m and appropriate Kronecker delta functions. 


ANS. $2^{n-1}\pi^{1/2}(2n+1)\,n!\,\delta_{n,m} + 2^n\pi^{1/2}(n+2)!\,\delta_{n+2,m} + 2^{n-2}\pi^{1/2}n!\,\delta_{n-2,m}$.

13.1.13 Show that



n, p, and r are nonnegative integers. 

Hint. Use a recurrence relation, Eq. (13.27), p times. 

13.1.14 (a) Using the Cauchy integral formula, develop an integral representation
of $H_n(x)$ based on Eq. (13.31) with the contour enclosing
the point $z = -x$.



(b) Show by direct substitution that this result satisfies the Hermite
equation.

13.1.15 (a) Verify the operator identity

$$x - \frac{d}{dx} = -\exp[x^2/2]\,\frac{d}{dx}\exp[-x^2/2].$$

(b) The normalized simple harmonic oscillator wave function is

$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\exp[-x^2/2]\,H_n(x).$$

Show that this may be written as

$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\left(x - \frac{d}{dx}\right)^{n}\exp[-x^2/2].$$

Note. This corresponds to an $n$-fold application of the raising operator.

13.1.16 Write a program that will generate the coefficients $a_s$ in the polynomial
form of the Hermite polynomial, $H_n(x) = \sum_{s=0}^{n} a_s x^s$.

13.1.17 A function $f(x)$ is expanded in a Hermite series:

$$f(x) = \sum_{n=0}^{\infty} a_n H_n(x).$$

From the orthogonality and normalization of the Hermite polynomials
the coefficient $a_n$ is given by

$$a_n = \frac{1}{2^n\pi^{1/2}n!}\int_{-\infty}^{\infty} f(x)H_n(x)\,e^{-x^2}\,dx.$$

For $f(x) = x^8$, determine the Hermite coefficients $a_n$ by the Gauss-Hermite
quadrature. Check your coefficients against AMS-55, Table 22.12.

13.1.18 Calculate and tabulate the normalized linear oscillator wave functions

$\psi_n(x) = 2^{-n/2}\pi^{-1/4}(n!)^{-1/2}H_n(x)\exp(-x^2/2)$ for $x = 0.0$ to $5.0$
in steps of 0.1 and $n = 0, 1, \ldots, 5$. Plot your results.

13.1.19 Consider two harmonic oscillators that are interacting through a potential
$V = cx_1x_2$, $|c| < m\omega^2$, where $x_1$ and $x_2$ are the oscillator
variables, $m$ is the common mass, and $\omega$ is the common oscillator frequency.
Find the exact energy levels. If $c > m\omega^2$, sketch the potential
surface $V(x_1, x_2)$ and explain why there is no ground state in this case.


13.2 Laguerre Functions 


If we start with the appropriate generating function, it is possible to develop the 
Laguerre polynomials in analogy with the Hermite polynomials. Alternatively, 
a series solution may be developed by the methods of Section 8.5. Instead, to 
illustrate a different technique, let us start with Laguerre’s ODE and obtain a 
solution in the form of a contour integral. From this integral representation a 
generating function will be derived. We want to use Laguerre’s ODE 

$$xy''(x) + (1 - x)y'(x) + ny(x) = 0 \tag{13.48}$$

over the interval $0 < x < \infty$ and for integer $n \geq 0$, which is motivated by
the radial ODE of Schrödinger’s partial differential equation for the hydrogen
atom.


Differential Equation—Laguerre Polynomials 
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We shall attempt to represent y, or rather y n since y will depend on n, by 
the contour integral 


$$y_n(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz, \tag{13.49}$$


and demonstrate that it satisfies Laguerre’s ODE. The contour includes the
origin but does not enclose the point $z = 1$. By differentiating the exponential
in Eq. (13.49), we obtain

$$y_n'(x) = -\frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)^2 z^{n}}\,dz, \qquad y_n''(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)^3 z^{n-1}}\,dz. \tag{13.50}$$

Substituting into the left-hand side of Eq. (13.48), we obtain

$$\frac{1}{2\pi i}\oint\left[\frac{x}{(1-z)^3 z^{n-1}} - \frac{1-x}{(1-z)^2 z^{n}} + \frac{n}{(1-z)\,z^{n+1}}\right]e^{-xz/(1-z)}\,dz, \tag{13.51}$$

which is equal to

$$-\frac{1}{2\pi i}\oint\frac{d}{dz}\left[\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n}}\right]dz. \tag{13.52}$$


If we integrate our perfect differential around a contour chosen so that the final
value equals the initial value (Fig. 13.3), the integral will vanish, thus verifying
that $y_n(x)$ [Eq. (13.49)] is a solution of Laguerre’s equation. We also see how
the coefficients of Laguerre’s ODE [Eq. (13.48)] determine the exponent of the
generating function by comparing with Eq. (13.52), thus making Eq. (13.49)
perhaps a bit less of a lucky guess.

Figure 13.3 Laguerre Function Contour

Figure 13.4 Laguerre Polynomials

It has become customary to define $L_n(x)$, the Laguerre polynomial
(Fig. 13.4), by (footnote 5)

$$L_n(x) = \frac{1}{2\pi i}\oint\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz. \tag{13.53}$$

This is exactly what we would obtain from the series

$$g(x, z) = \frac{e^{-xz/(1-z)}}{1-z} = \sum_{n=0}^{\infty} L_n(x)\,z^n, \qquad |z| < 1 \tag{13.54}$$

if we multiplied it by $z^{-n-1}$ and integrated around the origin. Applying the
residue theorem (Section 7.2), only the $z^{-1}$ term in the series survives. On
this basis we identify $g(x, z)$ as the generating function for the Laguerre
polynomials.

With the transformation

$$\frac{xz}{1-z} = s - x \quad\text{or}\quad z = \frac{s - x}{s}, \tag{13.55}$$

$$L_n(x) = \frac{e^x}{2\pi i}\oint\frac{s^n e^{-s}}{(s - x)^{n+1}}\,ds, \tag{13.56}$$

the new contour enclosing the point $s = x$ in the $s$-plane. By Cauchy’s integral
formula (for derivatives)

$$L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}(x^n e^{-x}) \quad (\text{integral } n), \tag{13.57}$$


5 Other notations of $L_n(x)$ are in use. Here, the definitions of the Laguerre polynomial $L_n(x)$ and
the associated Laguerre polynomial $L_n^k(x)$ agree with AMS-55 (Chapter 22).
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EXAMPLE 13.2.1 


Table 13.2 

Laguerre Polynomials 


EXAMPLE 13.2.2 


giving Rodrigues’s formula for Laguerre polynomials. From these representations
of $L_n(x)$ we find the series form (for integral $n$),

$$L_n(x) = \frac{(-1)^n}{n!}\left[x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right]$$

$$= \sum_{m=0}^{n}\frac{(-1)^m\,n!\,x^m}{(n-m)!\,m!\,m!} = \sum_{s=0}^{n}\frac{(-1)^{n-s}\,n!\,x^{n-s}}{(n-s)!\,(n-s)!\,s!}. \tag{13.58}$$


Lowest Laguerre Polynomials For $n = 0$, Eq. (13.57) yields $L_0 = \frac{e^x}{0!}(x^0 e^{-x}) = 1$;
for $n = 1$, we get $L_1 = e^x\frac{d}{dx}(xe^{-x}) = e^x(e^{-x} - xe^{-x}) = 1 - x$; and for
$n = 2$,

$$L_2 = \frac12 e^x\frac{d^2}{dx^2}(x^2 e^{-x}) = \frac12 e^x\frac{d}{dx}(2xe^{-x} - x^2 e^{-x}) = \frac12 e^x(2 - 2x - 2x + x^2)e^{-x} = 1 - 2x + \frac12 x^2,$$

etc., the specific polynomials listed in Table 13.2 (Exercise 13.2.1). ■


$L_0(x) = 1$

$L_1(x) = -x + 1$

$2!\,L_2(x) = x^2 - 4x + 2$

$3!\,L_3(x) = -x^3 + 9x^2 - 18x + 6$

$4!\,L_4(x) = x^4 - 16x^3 + 72x^2 - 96x + 24$

$5!\,L_5(x) = -x^5 + 25x^4 - 200x^3 + 600x^2 - 600x + 120$

$6!\,L_6(x) = x^6 - 36x^5 + 450x^4 - 2400x^3 + 5400x^2 - 4320x + 720$


Recursion Relation Carrying out the innermost differentiation in Eq. (13.57),
we find

$$L_n(x) = \frac{e^x}{n!}\frac{d^{n-1}}{dx^{n-1}}(nx^{n-1} - x^n)e^{-x} = L_{n-1}(x) - \frac{e^x}{n!}\frac{d^{n-1}}{dx^{n-1}}(x^n e^{-x}),$$

a formula from which we can derive the recursion $L_n' = L_{n-1}' - L_{n-1}$. To
generate $L_n$ from the last term in the previous equation, we multiply it by $e^{-x}$
and differentiate it, getting

$$\frac{d}{dx}(e^{-x}L_n(x)) = \frac{d}{dx}(e^{-x}L_{n-1}(x)) - \frac{1}{n!}\frac{d^n}{dx^n}(x^n e^{-x}),$$

which can also be written as

$$e^{-x}(-L_n + L_n' + L_{n-1} - L_{n-1}') = -\frac{1}{n!}\frac{d^n}{dx^n}(x^n e^{-x}).$$

Multiplying this result by $e^x$ and using the definition [Eq. (13.57)] yields

$$L_n'(x) = L_{n-1}'(x) - L_{n-1}(x). \tag{13.59}$$
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By differentiating the generating function in Eq. (13.54) with respect to x 
we can rederive this recursion relation as follows: 

$$(1 - z)\frac{\partial g}{\partial x} = -\frac{z}{1-z}\,e^{-xz/(1-z)} = -z\,g(x, z).$$

Expanding this result according to the definition of the generating function,
Eq. (13.54), yields

$$\sum_{n} L_n'(z^n - z^{n+1}) = \sum_{n=1}^{\infty}(L_n' - L_{n-1}')\,z^n = -\sum_{n=1}^{\infty} L_{n-1}\,z^n,$$

which implies the recursion relation [Eq. (13.59)].

By differentiating the generating function in Eq. (13.54) with respect to x 
and z, we obtain other recurrence relations 

\[ (n+1)L_{n+1}(x) = (2n + 1 - x)L_n(x) - nL_{n-1}(x), \]

\[ xL_n'(x) = nL_n(x) - nL_{n-1}(x). \]    (13.60)

Equation (13.60), modified to read

\[ L_{n+1}(x) = 2L_n(x) - L_{n-1}(x) - \left[(1 + x)L_n(x) - L_{n-1}(x)\right]/(n+1) \]    (13.61)

for reasons of economy and numerical stability, is used for computation of numerical values of $L_n(x)$. The computer starts with known numerical values of $L_0(x)$ and $L_1(x)$ (Table 13.2) and works up step by step. This is the same technique discussed for computing Legendre polynomials in Section 11.2.
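The step-by-step procedure just described can be sketched in a few lines of Python. This is only a minimal illustration under our own naming (the function `laguerre` is not from the text); it starts from $L_0 = 1$ and $L_1 = 1 - x$ and applies Eq. (13.61) repeatedly.

```python
def laguerre(n, x):
    """Return L_n(x) using the upward recurrence of Eq. (13.61)."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    prev, curr = 1.0, 1.0 - x          # L_0(x) and L_1(x), as in Table 13.2
    if n == 0:
        return prev
    for k in range(1, n):              # each pass steps from L_k to L_{k+1}
        prev, curr = curr, 2.0 * curr - prev - ((1.0 + x) * curr - prev) / (k + 1)
    return curr


if __name__ == "__main__":
    for n in range(7):
        print(n, laguerre(n, 1.0))
```

The values at $x = 1$ can be compared with the entries of Table 13.2 evaluated by hand; at $x = 0$ every $L_n$ should come out as 1.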


EXAMPLE 13.2.3 


Special Values From Eq. (13.54) for $x = 0$ and arbitrary $z$, we find

\[ g(0, z) = \frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n = \sum_{n=0}^{\infty} L_n(0)\, z^n \]

and, therefore, the special values

\[ L_n(0) = 1, \]    (13.62)

by comparing coefficients of both power series. ■


As is seen from the form of the generating function, the form of Laguerre’s 
ODE, or from Table 13.2, the Laguerre polynomials have neither odd nor even 
symmetry (parity). 

The Laguerre ODE is not self-adjoint and the Laguerre polynomials, $L_n(x)$, do not by themselves form an orthogonal set. However, following the method of Section 9.1, if we multiply Eq. (13.48) by $e^{-x}$ (Exercise 9.1.1) we obtain

\[ \int_0^{\infty} e^{-x} L_m(x) L_n(x)\, dx = \delta_{mn}. \]    (13.63)

This orthogonality is a consequence of the Sturm-Liouville theory (Section 9.1). The normalization follows from the generating function. It is sometimes convenient to define orthogonal Laguerre functions (with unit weighting function) by

\[ \varphi_n(x) = e^{-x/2} L_n(x). \]    (13.64)


EXAMPLE 13.2.4 


Special Integrals Setting $z = 1/2$ in the generating function, Eq. (13.54), gives the relation

\[ g\left(x, \tfrac{1}{2}\right) = 2e^{-x} = \sum_{n=0}^{\infty} L_n(x)\, 2^{-n}. \]

Multiplying by $e^{-x} L_m(x)$, integrating, and using orthogonality yields

\[ 2\int_0^{\infty} e^{-2x} L_m(x)\, dx = 2^{-m}. \]

Setting $z = -1/2$, $\pm 1/3$, etc., we can derive numerous other special integrals.
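The special integral above can be checked in exact rational arithmetic. The following sketch is our own illustration, not part of the text: it builds the coefficients of $L_m(x)$ from the series form, Eq. (13.58), and uses the elementary result $\int_0^\infty e^{-2x} x^j\, dx = j!/2^{j+1}$. The helper names `laguerre_coeffs` and `special_integral` are assumptions made for this example.

```python
from fractions import Fraction
from math import factorial


def laguerre_coeffs(n):
    """Coefficients c[m] of x^m in L_n(x), from the series form Eq. (13.58)."""
    return [Fraction((-1) ** m * factorial(n), factorial(n - m) * factorial(m) ** 2)
            for m in range(n + 1)]


def special_integral(m):
    """Exact value of 2 * int_0^inf exp(-2x) L_m(x) dx."""
    # Term by term: int_0^inf exp(-2x) x^j dx = j! / 2^(j+1).
    return 2 * sum(c * Fraction(factorial(j), 2 ** (j + 1))
                   for j, c in enumerate(laguerre_coeffs(m)))


if __name__ == "__main__":
    for m in range(6):
        print(m, special_integral(m), Fraction(1, 2 ** m))
```

Each printed pair should agree, confirming $2\int_0^\infty e^{-2x} L_m\, dx = 2^{-m}$ for the values of $m$ tried.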


Our new orthonormal function $\varphi_n(x)$ satisfies the ODE

\[ x\varphi_n''(x) + \varphi_n'(x) + \left(n + \frac{1}{2} - \frac{x}{4}\right)\varphi_n(x) = 0, \]    (13.65)

which is seen to have the (self-adjoint) Sturm-Liouville form. Note that it is the boundary conditions in the Sturm-Liouville theory that fix our interval as $(0 \le x < \infty)$.


Biographical Data 

Laguerre, Edmond Nicolas. Laguerre, a French mathematician, was born 
in 1834 and died in Bar-le-Duc in 1886. He contributed to continued fractions 
and the theory of algebraic equations, and he was one of the founders of 
modern axiomatic geometry. 


Associated Laguerre Polynomials 


In many applications, particularly the hydrogen atom wave functions in quantum theory, we also need the associated Laguerre polynomials as in Example 13.2.5 defined by⁶

\[ L_n^k(x) = (-1)^k \frac{d^k}{dx^k} L_{n+k}(x). \]    (13.66)

From the series form of $L_n(x)$,

\[ L_0^k(x) = 1, \]
\[ L_1^k(x) = -x + k + 1, \]
\[ L_2^k(x) = \frac{x^2}{2} - (k + 2)x + \frac{(k + 2)(k + 1)}{2}. \]    (13.67)

⁶ Some authors use $\mathcal{L}_{n+k}^k(x) = (d^k/dx^k)[L_{n+k}(x)]$. Hence our $L_n^k(x) = (-1)^k \mathcal{L}_{n+k}^k(x)$.
In general,

\[ L_n^k(x) = \sum_{m=0}^{n} (-1)^m \frac{(n + k)!}{(n - m)!\,(k + m)!\,m!}\, x^m, \quad k > -1. \]    (13.68)

A generating function may be developed by differentiating the Laguerre generating function $k$ times. Adjusting the index to $L_{n+k}$, we obtain

\[ \frac{e^{-xz/(1-z)}}{(1 - z)^{k+1}} = \sum_{n=0}^{\infty} L_n^k(x)\, z^n, \quad |z| < 1. \]    (13.69)

From this, for $x = 0$, the binomial expansion yields

\[ L_n^k(0) = \frac{(n + k)!}{n!\,k!}. \]    (13.70)


Recurrence relations can easily be derived from the generating function or by differentiating the Laguerre polynomial recurrence relations. Among the numerous possibilities are

\[ (n + 1)L_{n+1}^k(x) = (2n + k + 1 - x)L_n^k(x) - (n + k)L_{n-1}^k(x), \]    (13.71)

\[ x\frac{dL_n^k(x)}{dx} = nL_n^k(x) - (n + k)L_{n-1}^k(x). \]    (13.72)
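As a hedged illustration of how Eq. (13.71) can be used numerically, the sketch below computes $L_n^k(x)$ by upward recurrence and cross-checks it against the explicit series, Eq. (13.68), for integer $k \ge 0$. The function names are our own, and the starting values $L_0^k = 1$, $L_1^k = k + 1 - x$ are taken from Eq. (13.67).

```python
from math import comb, factorial


def assoc_laguerre(n, k, x):
    """L_n^k(x) for integer n >= 0 by the upward recurrence of Eq. (13.71)."""
    prev, curr = 1.0, k + 1.0 - x      # L_0^k and L_1^k from Eq. (13.67)
    if n == 0:
        return prev
    for j in range(1, n):              # each pass steps from L_j^k to L_{j+1}^k
        prev, curr = curr, ((2 * j + k + 1 - x) * curr - (j + k) * prev) / (j + 1)
    return curr


def assoc_laguerre_series(n, k, x):
    """Same polynomial from the explicit series, Eq. (13.68), integer k >= 0."""
    return sum((-1) ** m * factorial(n + k)
               / (factorial(n - m) * factorial(k + m) * factorial(m)) * x ** m
               for m in range(n + 1))


if __name__ == "__main__":
    print(assoc_laguerre(4, 3, 1.7), assoc_laguerre_series(4, 3, 1.7))
    print(assoc_laguerre(5, 2, 0.0), comb(7, 5))   # compare with Eq. (13.70)
```

The second printed line compares $L_5^2(0)$ with the binomial coefficient $(n+k)!/(n!\,k!)$ of Eq. (13.70).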

From these or from differentiating Laguerre's ODE $k$ times, we have the associated Laguerre ODE

\[ x\frac{d^2 L_n^k(x)}{dx^2} + (k + 1 - x)\frac{dL_n^k(x)}{dx} + nL_n^k(x) = 0. \]    (13.73)

When associated Laguerre polynomials appear in a physical problem, it is usually because that physical problem involves the ODE [Eq. (13.73)].

A Rodrigues representation of the associated Laguerre polynomial is

\[ L_n^k(x) = \frac{e^{x} x^{-k}}{n!}\frac{d^n}{dx^n}\left(e^{-x} x^{n+k}\right). \]    (13.74)

Note that all these formulas for $L_n^k(x)$ reduce to the corresponding expressions for $L_n(x)$ when $k = 0$.

The associated Laguerre equation [Eq. (13.73)] is not self-adjoint, but it can be put in self-adjoint form by multiplying by $e^{-x}x^k$, which becomes the weighting function (Section 9.1). We obtain

\[ \int_0^{\infty} e^{-x} x^k L_n^k(x) L_m^k(x)\, dx = \frac{(n + k)!}{n!}\,\delta_{mn}, \]    (13.75)

which shows the same orthogonality interval $[0, \infty)$ as that for the Laguerre polynomials. However, with a new weighting function, we have a new set of orthogonal polynomials, the associated Laguerre polynomials.
By letting $\psi_n^k(x) = e^{-x/2} x^{k/2} L_n^k(x)$, $\psi_n^k(x)$ satisfies the self-adjoint equation

\[ x\frac{d^2\psi_n^k(x)}{dx^2} + \frac{d\psi_n^k(x)}{dx} + \left(-\frac{x}{4} + \frac{2n + k + 1}{2} - \frac{k^2}{4x}\right)\psi_n^k(x) = 0. \]    (13.76)

The $\psi_n^k(x)$ are sometimes called Laguerre functions. Equation (13.65) is the special case $k = 0$.

A further useful form is given by defining⁷

\[ \Phi_n^k(x) = e^{-x/2} x^{(k+1)/2} L_n^k(x). \]    (13.77)

Substitution into the associated Laguerre equation yields

\[ \frac{d^2\Phi_n^k(x)}{dx^2} + \left(-\frac{1}{4} + \frac{2n + k + 1}{2x} - \frac{k^2 - 1}{4x^2}\right)\Phi_n^k(x) = 0. \]    (13.78)

The corresponding normalization integral is

\[ \int_0^{\infty} e^{-x} x^{k+1}\left[L_n^k(x)\right]^2 dx = \frac{(n + k)!}{n!}(2n + k + 1). \]    (13.79)

Notice that the $\Phi_n^k(x)$ do not form an orthogonal set (except with $x^{-1}$ as a weighting function) because of the $x^{-1}$ in the term $(2n + k + 1)/2x$.
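Both the orthogonality relation, Eq. (13.75), and the normalization integral, Eq. (13.79), can be verified exactly for small integer $n$ and $k$. The sketch below is our own illustration (the helper names are assumptions): it expands the product of two polynomials from Eq. (13.68) and integrates term by term with $\int_0^\infty e^{-x} x^p\, dx = p!$.

```python
from fractions import Fraction
from math import factorial


def assoc_coeffs(n, k):
    """Coefficients of x^m in L_n^k(x), from Eq. (13.68), integer k >= 0."""
    return [Fraction((-1) ** m * factorial(n + k),
                     factorial(n - m) * factorial(k + m) * factorial(m))
            for m in range(n + 1)]


def weighted_integral(n, m, k, extra_power=0):
    """Exact int_0^inf exp(-x) x^(k + extra_power) L_n^k(x) L_m^k(x) dx."""
    a, b = assoc_coeffs(n, k), assoc_coeffs(m, k)
    # Each monomial x^(i + j + k + extra_power) integrates to a factorial.
    return sum(a[i] * b[j] * factorial(i + j + k + extra_power)
               for i in range(len(a)) for j in range(len(b)))


if __name__ == "__main__":
    n, k = 3, 2
    print(weighted_integral(n, n, k))       # Eq. (13.75) with m = n
    print(weighted_integral(n, 2, k))       # Eq. (13.75) with m != n
    print(weighted_integral(n, n, k, 1))    # Eq. (13.79)
```

Setting `extra_power=1` switches the weight from $e^{-x}x^k$ to $e^{-x}x^{k+1}$, which is the only difference between the two integrals.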


EXAMPLE 13.2.5 


The Hydrogen Atom The most important application of the Laguerre polynomials is in the solution of the Schrödinger equation for the hydrogen atom. This equation is

\[ -\frac{\hbar^2}{2m}\nabla^2\psi - \frac{Ze^2}{4\pi\epsilon_0 r}\psi = E\psi, \]    (13.80)

where $Z = 1$ for hydrogen, 2 for singly ionized helium, and so on. Separating variables, we find that the angular dependence of $\psi$ is on the spherical harmonics $Y_L^M(\theta, \varphi)$ (see Section 11.5). The radial part, $R(r)$, satisfies the equation

\[ -\frac{\hbar^2}{2m}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{Ze^2}{4\pi\epsilon_0 r}R + \frac{\hbar^2}{2m}\frac{L(L + 1)}{r^2}R = ER. \]    (13.81)

For bound states $R \to 0$ as $r \to \infty$, and $R$ is finite at the origin, $r = 0$. These are the boundary conditions. We ignore the continuum states with positive energy. Only when the latter are included do the hydrogen wave functions form a complete set. By use of the abbreviations (resulting from rescaling $r$ to the dimensionless radial variable $\rho$)

\[ \rho = \beta r \quad \text{with} \quad \beta^2 = -\frac{8mE}{\hbar^2}, \quad E < 0, \quad \lambda = \frac{mZe^2}{2\pi\epsilon_0\beta\hbar^2}, \]    (13.82)

⁷ This corresponds to modifying the function $\psi$ in Eq. (13.76) to eliminate the first derivative (compare Exercise 8.6.3).
Eq. (13.81) becomes

\[ \frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{d\chi(\rho)}{d\rho}\right) + \left(\frac{\lambda}{\rho} - \frac{1}{4} - \frac{L(L + 1)}{\rho^2}\right)\chi(\rho) = 0, \]    (13.83)

where $\chi(\rho) = R(\rho/\beta)$. A comparison with Eq. (13.78) for $\Phi_n^k(x)$ shows that Eq. (13.83) is satisfied by

\[ \rho\chi(\rho) = e^{-\rho/2}\rho^{L+1} L_{\lambda-L-1}^{2L+1}(\rho), \]    (13.84)

in which $k$ is replaced by $2L + 1$ and $n$ by $\lambda - L - 1$.

We must restrict the parameter $\lambda$ by requiring it to be an integer $n$, $n = 1, 2, 3, \ldots$.⁸ This is necessary because the Laguerre function of nonintegral $n$ would diverge as $\rho^n e^{\rho}$, which is unacceptable for our physical problem with the boundary condition

\[ \lim_{r \to \infty} R(r) = 0. \]

This restriction on $\lambda$, imposed by our boundary condition, has the effect of quantizing the energy

\[ E_n = -\frac{Z^2 m}{2n^2\hbar^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2. \]    (13.85)

The negative sign indicates that we are dealing here with bound states, as $E = 0$ corresponds to an electron that is just able to escape to infinity, where the Coulomb potential goes to zero. Using this result for $E_n$, we have

\[ \beta = \frac{me^2}{2\pi\epsilon_0\hbar^2}\frac{Z}{n} = \frac{2Z}{na_0}, \quad \rho = \frac{2Z}{na_0}\, r, \]    (13.86)

with

\[ a_0 = \frac{4\pi\epsilon_0\hbar^2}{me^2}, \]

the Bohr radius.
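To give a feeling for the magnitudes in Eqs. (13.85) and (13.86), the short sketch below evaluates the Bohr radius and the first few energy levels. It is an illustration only: the numerical constants are CODATA 2018 values that we supply (they are not quoted in the text), the electron mass is used for $m$ (no reduced-mass correction), and the function names are our own.

```python
import math

# Assumed CODATA 2018 values in SI units (not taken from the text).
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m


def bohr_radius():
    """a_0 = 4 pi eps0 hbar^2 / (m e^2), in meters."""
    return 4.0 * math.pi * EPS0 * HBAR ** 2 / (M_E * E_CHARGE ** 2)


def energy_ev(n, Z=1):
    """Bound-state energy E_n of Eq. (13.85), converted to electron volts."""
    coulomb = E_CHARGE ** 2 / (4.0 * math.pi * EPS0)
    joule = -Z ** 2 * M_E / (2.0 * n ** 2 * HBAR ** 2) * coulomb ** 2
    return joule / E_CHARGE


if __name__ == "__main__":
    print("a_0 =", bohr_radius(), "m")
    for n in (1, 2, 3):
        print("E_%d = %.4f eV" % (n, energy_ev(n)))
```

The $1/n^2$ dependence and the $Z^2$ scaling of Eq. (13.85) can be read off directly by varying the arguments.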


Thus, the final normalized hydrogen wave function is written as

\[ \psi_{nLM}(r, \theta, \varphi) = \left[\left(\frac{2Z}{na_0}\right)^3 \frac{(n - L - 1)!}{2n\,(n + L)!}\right]^{1/2} e^{-\beta r/2}(\beta r)^L L_{n-L-1}^{2L+1}(\beta r)\, Y_L^M(\theta, \varphi). \]    (13.87)


SUMMARY

Laguerre polynomials arise as solutions of the Coulomb potential in quantum mechanics. Separating the Schrödinger equation in spherical polar coordinates defines their radial ODE. The Sturm-Liouville theory of this ODE implies their properties.


⁸ This is the conventional notation for $\lambda$. It is not the same $n$ as the index $n$ in $\Phi_n^k(x)$.
EXERCISES


13.2.1 Show with the aid of the Leibniz formula that the series expansion of $L_n(x)$ [Eq. (13.58)] follows from the Rodrigues representation [Eq. (13.57)].

13.2.2 (a) Using the explicit series form [Eq. (13.58)], show that

\[ L_n'(0) = -n, \qquad L_n''(0) = \frac{1}{2}n(n - 1). \]

(b) Repeat without using the explicit series form of $L_n(x)$.

13.2.3 From the generating function derive the Rodrigues representation

\[ L_n^k(x) = \frac{e^{x} x^{-k}}{n!}\frac{d^n}{dx^n}\left(e^{-x} x^{n+k}\right). \]


13.2.4 Derive the normalization relation [Eq. (13.75)] for the associated Laguerre polynomials.

13.2.5 Expand $x^r$ in a series of associated Laguerre polynomials $L_n^k(x)$, $k$ fixed and $n$ ranging from 0 to $r$ (or to $\infty$ if $r$ is not an integer).

Hint. The Rodrigues form of $L_n^k(x)$ will be useful.

ANS. \[ x^r = (r + k)!\, r! \sum_{n=0}^{r} \frac{(-1)^n L_n^k(x)}{(n + k)!\,(r - n)!}, \quad 0 \le x < \infty. \]

13.2.6 Expand $e^{-ax}$ in a series of associated Laguerre polynomials $L_n^k(x)$, $k$ fixed and $n$ ranging from 0 to $\infty$. Plot partial sums as a function of the upper limit $N = 1, 2, \ldots, 10$ to check the convergence.

(a) Evaluate directly the coefficients in your assumed expansion.

(b) Develop the desired expansion from the generating function.

ANS. \[ e^{-ax} = \frac{1}{(1 + a)^{1+k}} \sum_{n=0}^{\infty} \left(\frac{a}{1 + a}\right)^n L_n^k(x), \quad 0 \le x < \infty. \]

13.2.7 Show that

\[ \int_0^{\infty} e^{-x} x^{k+1} L_n^k(x) L_n^k(x)\, dx = \frac{(n + k)!}{n!}(2n + k + 1). \]

Hint. Note that

\[ xL_n^k = (2n + k + 1)L_n^k - (n + k)L_{n-1}^k - (n + 1)L_{n+1}^k. \]

13.2.8 Assume that a particular problem in quantum mechanics has led to the ODE

\[ \frac{d^2 y}{dx^2} - \left[\frac{k^2 - 1}{4x^2} - \frac{2n + k + 1}{2x} + \frac{1}{4}\right] y = 0. \]

Write $y(x)$ as

\[ y(x) = A(x)B(x)C(x) \]

with the requirements that

(a) $A(x)$ be a negative exponential giving the required asymptotic behavior of $y(x)$; and

(b) $B(x)$ be a positive power of $x$ giving the behavior of $y(x)$ for $0 \le x \ll 1$.

Determine $A(x)$ and $B(x)$. Find the relation between $C(x)$ and the associated Laguerre polynomial.

ANS. $A(x) = e^{-x/2}$, $B(x) = x^{(k+1)/2}$, $C(x) = L_n^k(x)$.


13.2.9 From Eq. (13.87) the normalized radial part of the hydrogenic wave function is

\[ R_{nL}(r) = \left[\beta^3 \frac{(n - L - 1)!}{2n\,(n + L)!}\right]^{1/2} e^{-\beta r/2}(\beta r)^L L_{n-L-1}^{2L+1}(\beta r), \]

where $\beta = 2Z/na_0 = Zme^2/(2\pi\epsilon_0\hbar^2 n)$. Evaluate

(a) \[ \langle r \rangle = \int_0^{\infty} r\, R_{nL}(\beta r) R_{nL}(\beta r)\, r^2\, dr, \]

(b) \[ \langle r^{-1} \rangle = \int_0^{\infty} r^{-1} R_{nL}(\beta r) R_{nL}(\beta r)\, r^2\, dr. \]

The quantity $\langle r \rangle$ is the average displacement of the electron from the nucleus, whereas $\langle r^{-1} \rangle$ is the average of the reciprocal displacement.

ANS. $\langle r \rangle = \frac{a_0}{2}\left[3n^2 - L(L + 1)\right]$, $\quad \langle r^{-1} \rangle = \frac{1}{n^2 a_0}$.

13.2.10 Derive the recurrence relation for the hydrogen wave function expectation values:

\[ \frac{s + 2}{n^2}\langle r^{s+1} \rangle - (2s + 3)a_0\langle r^{s} \rangle + \frac{s + 1}{4}\left[(2L + 1)^2 - (s + 1)^2\right]a_0^2\langle r^{s-1} \rangle = 0, \]

with $s \ge -2L - 1$ and $\langle r^s \rangle$ defined as in Exercise 13.2.9(a).

Hint. Transform Eq. (13.83) into a form analogous to Eq. (13.78). Multiply by $\rho^{s+2}u' - c\rho^{s+1}u$. Here, $u = \rho\Phi$. Adjust $c$ to cancel terms that do not yield expectation values.

13.2.11 The hydrogen wave functions, Eq. (13.87), are mutually orthogonal, as they should be since they are eigenfunctions of the self-adjoint Schrödinger equation

\[ \int \psi_{n_1 L_1 M_1}^{*}\, \psi_{n_2 L_2 M_2}\, r^2\, dr\, d\Omega = \delta_{n_1 n_2}\delta_{L_1 L_2}\delta_{M_1 M_2}. \]

However, the radial integral has the (misleading) form

\[ \int_0^{\infty} e^{-\beta r/2}(\beta r)^L L_{n_1-L-1}^{2L+1}(\beta r)\, e^{-\beta r/2}(\beta r)^L L_{n_2-L-1}^{2L+1}(\beta r)\, r^2\, dr, \]

which appears to match Eq. (13.79) and not the associated Laguerre orthogonality relation [Eq. (13.75)]. How do you resolve this paradox?

ANS. The parameter $\beta$ is dependent on $n$. The first three $\beta$ previously shown are $2Z/n_1 a_0$. The last three are $2Z/n_2 a_0$. For $n_1 = n_2$, Eq. (13.79) applies. For $n_1 \ne n_2$, neither Eq. (13.75) nor Eq. (13.79) is applicable.


13.2.12 A quantum mechanical analysis of the Stark effect (in parabolic coordinates) leads to the ODE

\[ \frac{d}{d\xi}\left(\xi\frac{du}{d\xi}\right) + \left(\frac{1}{2}E\xi + \alpha - \frac{m^2}{4\xi} - \frac{1}{4}F\xi^2\right)u = 0, \]

where $F$ is a measure of the perturbation energy introduced by an external electric field. Find the unperturbed wave functions ($F = 0$) in terms of associated Laguerre polynomials.

ANS. $u(\xi) = e^{-\varepsilon\xi/2}\,\xi^{m/2} L_p^m(\varepsilon\xi)$, with $\varepsilon = \sqrt{-2E} > 0$, $p = \alpha/\varepsilon - (m + 1)/2$, a nonnegative integer.

13.2.13 The wave equation for the three-dimensional harmonic oscillator is

\[ -\frac{\hbar^2}{2M}\nabla^2\psi + \frac{1}{2}M\omega^2 r^2\psi = E\psi, \]

where $\omega$ is the angular frequency of the corresponding classical oscillator. Show that the radial part of $\psi$ (in spherical polar coordinates) may be written in terms of associated Laguerre functions of argument ($\beta r^2$), where $\beta = M\omega/\hbar$.

Hint. As in Exercise 13.2.8, split off radial factors of $r^l$ and $e^{-\beta r^2/2}$. The associated Laguerre function will have the form $L_{(n-l-1)/2}^{l+1/2}(\beta r^2)$.

13.2.14 Write a program (in Basic or Fortran or use symbolic software) that will generate the coefficients $a_s$ in the polynomial form of the Laguerre polynomial, $L_n(x) = \sum_{s=0}^{n} a_s x^s$.

13.2.15 Write a subroutine that will transform a finite power series $\sum_{n=0}^{N} a_n x^n$ into a Laguerre series $\sum_{n=0}^{N} b_n L_n(x)$. Use the recurrence relation Eq. (13.60).

13.2.16 Tabulate $L_{10}(x)$ for $x = 0.0$ to 30.0 in steps of 0.1. This will include the 10 roots of $L_{10}$. Beyond $x = 30.0$, $L_{10}(x)$ is monotonically increasing. Plot your results.

Check value. Eighth root = 16.279.
Additional Reading


Abramowitz, M., and Stegun, I. A. (Eds.) (1964). Handbook of Mathematical 
Functions, Applied Mathematics Series-55 (AMS-55). National Bureau of 
Standards, Washington, DC. Paperback edition, Dover, New York (1974). 
Chapter 22 is a detailed summary of the properties and representations of 
orthogonal polynomials. Other chapters summarize properties of Bessel, 
Legendre, hypergeometric, and confluent hypergeometric functions and 
much more. 

Erdelyi, A., Magnus, W., Oberhettinger, F., and Tricomi, F. G. (1953). Higher 
Transcendental Functions. McGraw-Hill, New York. Reprinted, Krieger, 
Melbourne, FL (1981). A detailed, almost exhaustive listing of the properties
of the special functions of mathematical physics.

Lebedev, N. N. (1965). Special Functions and Their Applications (R. A. 
Silverman, Trans.). Prentice-Hall, Englewood Cliffs, NJ. Paperback, Dover, 
New York (1972). 

Luke, Y. L. (1969). The Special Functions and Their Approximations. Academic
Press, New York. Volume 1 is a thorough theoretical treatment of
gamma functions, hypergeometric functions, confluent hypergeometric 
functions, and related functions. Volume 2 develops approximations and 
other techniques for numerical work. 

Luke, Y. L. (1975). Mathematical Functions and Their Approximations. 
Academic Press, New York. This is an updated supplement to Handbook 
of Mathematical Functions with Formulas, Graphs and Mathematical 
Tables (AMS-55). 

Magnus, W., Oberhettinger, F., and Soni, R. P. (1966). Formulas and Theorems 
for the Special Functions of Mathematical Physics. Springer, New York. 
An excellent summary of just what the title says, including the topics of 
Chapters 10-13. 

Rainville, E. D. (1960). Special Functions. Macmillan, New York. Reprinted, 
Chelsea, New York (1971). This book is a coherent, comprehensive account 
of almost all the special functions of mathematical physics that the reader 
is likely to encounter. 

Sansone, G. (1959). Orthogonal Functions (A. H. Diamond, Trans.). Interscience,
New York. Reprinted, Dover, New York (1991).

Sneddon, I. N. (1980). Special Functions of Mathematical Physics and Chemistry,
3rd ed. Longman, New York.

Thompson, W. J. (1997). Atlas for Computing Mathematical Functions: An 
Illustrated Guidebook for Practitioners with Programs in Fortran 90 
and Mathematica. Wiley, New York. 

Whittaker, E. T., and Watson, G. N. (1997). A Course of Modern Analysis
(reprint). Cambridge Univ. Press, Cambridge, UK. The classic text on 
special functions and real and complex analysis. 






Fourier Series 



Periodic phenomena involving waves [$\sim \sin(2\pi x/\lambda)$ as a crude approximation to water waves, for example], motors, rotating machines (harmonic motion), or some repetitive pattern of a driving force are described by periodic functions. Fourier series are a basic tool for solving ordinary differential equations (ODEs) and partial differential equations (PDEs) with periodic boundary conditions. Fourier integrals for nonperiodic phenomena are developed in Chapter 15. The common name for the whole field is Fourier analysis.

A Fourier series is defined as an expansion of a function or representation of a function in a series of sines and cosines, such as

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx. \]    (14.1)

The coefficients $a_0$, $a_n$, and $b_n$ are related to the periodic function $f(x)$ by definite integrals:

\[ a_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos nx\, dx, \quad n = 0, 1, 2, \ldots, \]    (14.2)

\[ b_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin nx\, dx, \quad n = 1, 2, \ldots. \]    (14.3)

This result, of course, is subject to the requirement that the integrals exist. They do if $f(x)$ is piecewise continuous (or square integrable). Notice that $a_0$ is singled out for special treatment by the inclusion of the factor $\frac{1}{2}$. This is done so that Eq. (14.2) will apply to all $a_n$, $n = 0$ as well as $n > 0$.
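As a rough numerical illustration of Eqs. (14.2) and (14.3), the following sketch approximates the coefficient integrals over $[0, 2\pi]$ by an equally spaced sum (which is the trapezoidal rule for a periodic integrand). The function name, the sample count, and the test function are our own choices, not part of the text.

```python
import math


def fourier_coefficients(f, n_max, samples=2048):
    """Approximate a_0..a_nmax and b_0..b_nmax of Eqs. (14.2) and (14.3)."""
    h = 2.0 * math.pi / samples
    xs = [j * h for j in range(samples)]          # grid on [0, 2*pi)
    fx = [f(x) for x in xs]
    a = [h / math.pi * sum(v * math.cos(n * x) for v, x in zip(fx, xs))
         for n in range(n_max + 1)]
    b = [h / math.pi * sum(v * math.sin(n * x) for v, x in zip(fx, xs))
         for n in range(n_max + 1)]
    return a, b


if __name__ == "__main__":
    # Test function with known coefficients: a_0 = 2, a_3 = 2, b_5 = -1.
    f = lambda x: 1.0 + 2.0 * math.cos(3.0 * x) - math.sin(5.0 * x)
    a, b = fourier_coefficients(f, 6)
    print([round(v, 6) for v in a])
    print([round(v, 6) for v in b])
```

Note that the constant term 1 shows up as $a_0 = 2$, in line with the factor $\frac{1}{2}$ in Eq. (14.1).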

The Sturm-Liouville theory of the harmonic oscillator in Example 9.1.4 
guarantees the validity of Eqs. (14.2) and (14.3) and, by use of the orthogonality 
relations (Example 9.2.1), allows us to compute the expansion coefficients. 
Another way of describing what we are doing here is to say that $f(x)$ is part of an infinite-dimensional Hilbert space, with the orthogonal $\cos nx$ and $\sin nx$ as the basis. The statement that $\cos nx$ and $\sin nx$ ($n = 0, 1, 2, \ldots$) span this Hilbert space is equivalent to saying that they form a complete set. Finally, the expansion coefficients $a_n$ and $b_n$ correspond to the projections of $f(x)$, with the integral inner products [Eqs. (14.2) and (14.3)] playing the role of the dot product of Section 1.2. These points are outlined in Section 9.4.

The conditions imposed on $f(x)$ to make Eq. (14.1) valid, and the series convergent, are that $f(x)$ has only a finite number of finite discontinuities and only a finite number of extreme values, maxima, and minima in the interval $[0, 2\pi]$.¹ Functions satisfying these conditions are called piecewise regular. The conditions are known as the Dirichlet conditions. Although there are some functions that do not obey these Dirichlet conditions, they may well be labeled pathological for purposes of Fourier expansions. In the vast majority of physical problems involving a Fourier series, these conditions will be satisfied. In most physical problems we shall be interested in functions that are square integrable, for which the sines and cosines form a complete orthogonal set. This in turn means that Eq. (14.1) is valid in the sense of convergence in the mean (see Eq. 9.63).


Completeness 


The Fourier expansion and the completeness property may be expected because the functions $\sin nx$, $\cos nx$, $e^{inx}$ are all eigenfunctions of a self-adjoint linear ODE,

\[ y'' + n^2 y = 0. \]    (14.4)

We obtain orthogonal eigenfunctions for different values of the eigenvalue $n$ for the interval $[0, 2\pi]$ to satisfy the boundary conditions in the Sturm-Liouville theory (Chapter 9). The different eigenfunctions for the same eigenvalue $n$ are orthogonal. We have

\[ \int_0^{2\pi} \sin mx \sin nx\, dx = \begin{cases} \pi\delta_{mn}, & m \ne 0, \\ 0, & m = 0, \end{cases} \]    (14.5)

\[ \int_0^{2\pi} \cos mx \cos nx\, dx = \begin{cases} \pi\delta_{mn}, & m \ne 0, \\ 2\pi, & m = n = 0, \end{cases} \]    (14.6)

\[ \int_0^{2\pi} \sin mx \cos nx\, dx = 0 \quad \text{for all integral } m \text{ and } n. \]    (14.7)

Note that any interval $x_0 \le x \le x_0 + 2\pi$ will be equally satisfactory. Frequently, we use $x_0 = -\pi$ to obtain the interval $-\pi \le x \le \pi$. For the complex eigenfunctions $e^{\pm inx}$, orthogonality is usually defined in terms of the complex conjugate of one of the two factors,

\[ \int_0^{2\pi} \left(e^{imx}\right)^{*} e^{inx}\, dx = 2\pi\delta_{mn}. \]    (14.8)

This agrees with the treatment of the spherical harmonics (Section 11.5).

¹ These conditions are sufficient but not necessary.


EXAMPLE 14.1.1 


Sawtooth Wave Let us apply Eqs. (14.2) and (14.3) to the sawtooth shape shown in Fig. 14.1 to derive its Fourier series. Our sawtooth function can also be expressed as

\[ f(x) = \begin{cases} x, & 0 \le x < \pi, \\ x - 2\pi, & \pi < x \le 2\pi, \end{cases} \]

which is an odd function of the variable $x$. Hence, we expect a pure sine expansion. Integrating by parts, we indeed find

\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} x\cos nx\, dx = \frac{x}{n\pi}\sin nx\,\Big|_{-\pi}^{\pi} - \frac{1}{n\pi}\int_{-\pi}^{\pi} \sin nx\, dx = \frac{1}{n^2\pi}\cos nx\,\Big|_{-\pi}^{\pi} = \frac{1}{n^2\pi}\left[(-1)^n - (-1)^n\right] = 0, \]

while

\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} x\sin nx\, dx = -\frac{x}{n\pi}\cos nx\,\Big|_{-\pi}^{\pi} + \frac{1}{n\pi}\int_{-\pi}^{\pi} \cos nx\, dx = -\frac{2}{n}(-1)^n + \frac{1}{n^2\pi}\sin nx\,\Big|_{-\pi}^{\pi} = \frac{2}{n}(-1)^{n+1}. \]

This establishes the Fourier expansion

\[ f(x) = 2\left[\sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots + (-1)^{n+1}\frac{\sin nx}{n} + \cdots\right] = 2\sum_{n=1}^{\infty} (-1)^{n+1}\frac{\sin nx}{n} = x, \quad -\pi < x < \pi, \]    (14.9)

which converges only conditionally, not absolutely, because of the discontinuity of $f(x)$ at $x = \pm\pi$. It makes no difference whether a discontinuity is in the interior of the expansion interval or at its ends: It will give rise to conditional convergence of the Fourier series. In terms of physical applications with $x = \omega$ a frequency, conditional convergence means that our sawtooth wave is dominated by high-frequency components. ■

Figure 14.1 Sawtooth Wave
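The slow, conditional convergence of Eq. (14.9) is easy to observe numerically. The sketch below (our own illustration; the function name is an assumption) sums the first $N$ terms of the series and compares the result with $f(x) = x$ inside the interval.

```python
import math


def sawtooth_partial_sum(x, n_terms):
    """Sum of the first n_terms terms of the sawtooth series, Eq. (14.9)."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n
                     for n in range(1, n_terms + 1))


if __name__ == "__main__":
    x = 1.0
    for n_terms in (10, 100, 1000):
        s = sawtooth_partial_sum(x, n_terms)
        print(n_terms, s, abs(s - x))      # the error shrinks only slowly with N
    print(sawtooth_partial_sum(math.pi, 50))   # every term vanishes at x = pi
```

With coefficients falling only as $1/n$, many terms are needed to reduce the error at an interior point appreciably.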


Behavior of Discontinuities 


The behavior at $x = \pm\pi$ is an example of a general rule that at a finite discontinuity the series converges to the arithmetic mean. For a discontinuity at $x = x_0$ the series yields

\[ f(x_0) = \frac{1}{2}\left[f(x_0 + 0) + f(x_0 - 0)\right], \]    (14.10)

the arithmetic mean of the right and left approaches to $x = x_0$. A general proof using partial sums is given by Jeffreys and by Carslaw (see Additional Reading).

An idea of the convergence of a Fourier series and the error in using only a finite number of terms in the series may be obtained by considering the expansion of the sawtooth shape of Fig. 14.1. Figure 14.2 shows $f(x)$ for $0 \le x < \pi$ for the sum of 4, 6, and 10 terms of the series. Three features deserve comment:


1. There is a steady increase in the accuracy of the representation as the number of terms included is increased.

2. All the curves pass through the midpoint $f(x) = 0$ at $x = \pi$.

3. In the vicinity of $x = \pi$ there is an overshoot that persists and shows no sign of diminishing. This overshoot (and undershoot) is called the Gibbs phenomenon and is a typical feature of Fourier series. The inclusion of more terms does nothing to remove the overshoot (undershoot) but merely moves it closer to the point of discontinuity. The Gibbs phenomenon is not limited to Fourier series. It occurs with other eigenfunction expansions. For more details, see W. J. Thompson, Fourier series and the Gibbs phenomenon, Am. J. Phys. 60, 425 (1992).

Figure 14.2 Fourier Representation of Sawtooth Wave

Figure 14.3 Full Wave Rectifier
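The persistence of the overshoot can be seen in a short numerical experiment on the sawtooth series, Eq. (14.9). This sketch is our own illustration: setting the derivative of the $N$-term partial sum to zero shows that its extremum closest to the discontinuity lies at $x = \pi - \pi/(N + 1)$, and we simply evaluate the partial sum there. The limiting peak height, about 3.70 compared with $f \to \pi \approx 3.14$, is a standard result that we quote as an assumption rather than derive.

```python
import math


def sawtooth_partial_sum(x, n_terms):
    """Sum of the first n_terms terms of the sawtooth series, Eq. (14.9)."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n
                     for n in range(1, n_terms + 1))


def gibbs_peak(n_terms):
    """Partial sum at its extremum nearest the jump, x = pi - pi/(N + 1)."""
    x_peak = math.pi - math.pi / (n_terms + 1)
    return x_peak, sawtooth_partial_sum(x_peak, n_terms)


if __name__ == "__main__":
    for n_terms in (10, 50, 200, 2000):
        x_peak, height = gibbs_peak(n_terms)
        print(n_terms, x_peak, height, height - math.pi)
```

As $N$ grows, the peak position moves toward $x = \pi$ while the excess over $\pi$ does not shrink to zero; it settles near 0.56, roughly 9% of the total jump $2\pi$.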


One of the advantages of a Fourier representation over some other representation, such as a Taylor series, is that it can represent a discontinuous function. An example is the sawtooth wave in the preceding section and Example 14.1.3. Other examples are considered in the exercises.


EXAMPLE 14.1.2 


Full-Wave Rectifier Consider the case of an absolutely convergent Fourier series representing a continuous periodic function, displayed in Fig. 14.3. Let us ask how well the output of a full-wave rectifier approaches pure direct current. Our rectifier may be thought of as having passed the positive peaks of an incoming sine wave and inverting the negative peaks. This yields

\[ f(t) = \sin\omega t, \quad 0 < \omega t < \pi, \]
\[ f(t) = -\sin\omega t, \quad -\pi < \omega t < 0. \]    (14.11)

Since $f(t)$ defined here is even, no terms of the form $\sin n\omega t$ will appear. Again, from Eqs. (14.2) and (14.3), we have

\[ a_0 = -\frac{1}{\pi}\int_{-\pi}^{0} \sin\omega t\, d(\omega t) + \frac{1}{\pi}\int_0^{\pi} \sin\omega t\, d(\omega t) = \frac{2}{\pi}\int_0^{\pi} \sin\omega t\, d(\omega t) = \frac{4}{\pi}, \]    (14.12)

\[ a_n = \frac{2}{\pi}\int_0^{\pi} \sin\omega t\cos n\omega t\, d(\omega t) = \begin{cases} -\dfrac{2}{\pi}\,\dfrac{2}{n^2 - 1}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases} \]    (14.13)

Note that $(0, \pi)$ is not an orthogonality interval for both sines and cosines together and we do not get zero for even $n$. The resulting series is

\[ f(t) = \frac{2}{\pi} - \frac{4}{\pi}\sum_{n=2,4,6,\ldots}^{\infty} \frac{\cos n\omega t}{n^2 - 1}. \]    (14.14)

The original frequency $\omega$ has been eliminated. The lowest frequency oscillation is $2\omega$. The high-frequency components fall off as $n^{-2}$, showing that the full-wave rectifier does a fairly good job of approximating direct current. Whether this good approximation is adequate depends on the particular application. If the remaining ac components are objectionable, they may be further suppressed by appropriate filter circuits. These two examples highlight two features characteristic of Fourier expansions:²

• If $f(x)$ has discontinuities (as in the sawtooth wave in Example 14.1.1), we can expect the $n$th coefficient to be decreasing as $O(1/n)$. Convergence is conditional only.³

• If $f(x)$ is continuous (although possibly with discontinuous derivatives as in the full-wave rectifier of Example 14.1.2), we can expect the $n$th coefficient to be decreasing as $1/n^2$, that is, absolute convergence. ■

Figure 14.4 Square Wave
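The faster, $1/n^2$ convergence of the rectifier series can be checked against the closed form $|\sin\omega t|$. The sketch below is our own illustration of Eq. (14.14); the function name and the cutoff values are assumptions.

```python
import math


def rectifier_partial_sum(wt, n_max):
    """Partial sum of Eq. (14.14), keeping even harmonics n = 2, 4, ..., n_max."""
    total = 2.0 / math.pi
    for n in range(2, n_max + 1, 2):
        total -= (4.0 / math.pi) * math.cos(n * wt) / (n * n - 1)
    return total


if __name__ == "__main__":
    for wt in (0.0, 1.0, 2.5):
        for n_max in (4, 20, 200):
            s = rectifier_partial_sum(wt, n_max)
            print(wt, n_max, s, abs(s - abs(math.sin(wt))))
```

The worst case is at $\omega t = 0$, where the function has its corner; even there a modest number of harmonics reproduces the waveform to a few parts in a thousand, in contrast to the sawtooth of Example 14.1.1.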


EXAMPLE 14.1.3 


Square Wave-High Frequencies One application of Fourier series, the analysis of a "square" wave (Fig. 14.4) in terms of its Fourier components, occurs in electronic circuits designed to handle sharply rising pulses. This example explains the physical meaning of conditional convergence. Suppose that our wave is defined by

\[ f(x) = 0, \quad -\pi < x < 0, \]
\[ f(x) = h, \quad 0 < x < \pi. \]    (14.15)


² Raisbeek, G. (1955). Order of magnitude of Fourier coefficients. Am. Math. Mon. 62, 149-155.

³ A technique for improving the rate of convergence is developed in the exercises of Section 5.9.
From Eqs. (14.2) and (14.3) we find

\[ a_0 = \frac{1}{\pi}\int_0^{\pi} h\, dt = h, \]    (14.16)

\[ a_n = \frac{1}{\pi}\int_0^{\pi} h\cos nt\, dt = 0, \quad n = 1, 2, 3, \ldots, \]    (14.17)

\[ b_n = \frac{1}{\pi}\int_0^{\pi} h\sin nt\, dt = \frac{h}{n\pi}(1 - \cos n\pi); \]    (14.18)

\[ b_n = \frac{2h}{n\pi}, \quad n \text{ odd}, \]    (14.19)

\[ b_n = 0, \quad n \text{ even}. \]    (14.20)

The resulting series is

\[ f(x) = \frac{h}{2} + \frac{2h}{\pi}\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right). \]    (14.21)

Except for the first term, which represents an average of $f(x)$ over the interval $[-\pi, \pi]$, all the cosine terms have vanished. Since $f(x) - h/2$ is odd, we have a Fourier sine series. Although only the odd terms in the sine series occur, they fall only as $n^{-1}$. This conditional convergence is like that of the alternating harmonic series. Physically, this means that our square wave contains a lot of high-frequency components. If the electronic apparatus will not pass these components, our square wave input will emerge more or less rounded off, perhaps as an amorphous blob. ■


Biographical Data 

Fourier, Jean Baptiste Joseph, Baron. Fourier, a French mathematician, was born in 1768 in Auxerre, France, and died in Paris in 1830. After his graduation from a military school in Paris, he became a professor at the school in 1795. In 1808, after his great mathematical discoveries involving the series and integrals named after him, he was made a baron by Napoleon. Earlier, he had survived Robespierre and the French Revolution. When Napoleon returned to France in 1815 after his abdication and first exile to Elba, Fourier rejoined him and, after Waterloo, fell out of favor for a while. In 1822, his book on the Analytic Theory of Heat appeared and inspired Ohm to new thoughts on the flow of electricity.


SUMMARY

Fourier series are finite or infinite sums of sines and cosines that describe periodic functions that can have discontinuities and thus represent a wider class of functions than we have considered so far. Because $\sin nx$, $\cos nx$ are eigenfunctions of a self-adjoint ODE, the classical harmonic oscillator equation, the Hilbert space properties of Fourier series are consequences of the Sturm-Liouville theory.
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EXERCISES 


14.1.1 A function fix) (quadratically integrable) is to be represented by a 
finite Fourier series. A convenient measure of the accuracy of the 
series is given by the integrated square of the deviation 


A p — 


/»2 7 T 

Jo 


fix) —^ cos nx + b n sin nx) 

£-1 


71= 1 


dx. 


Show that the requirement that A /( be minimized, that is, 


9 A, 

dCL n 


= o, 


9Aj 

96 m 


= 0, 


for all n, leads to choosing a n and b n , as given in Eqs. (14.2) and (14.3). 
Note. Your coefficients a n and b n are independent of p. This indepen¬ 
dence is a consequence of orthogonality and would not hold for powers 
of x, fitting a curve with polynomials. 

14.1.2 In the analysis of a complex waveform (ocean tides, earthquakes, musical tones, etc.) it might be more convenient to have the Fourier series written as

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \alpha_n\cos(nx - \theta_n). \]

Show that this is equivalent to Eq. (14.1) with

\[ a_n = \alpha_n\cos\theta_n, \qquad \alpha_n^2 = a_n^2 + b_n^2, \]
\[ b_n = \alpha_n\sin\theta_n, \qquad \tan\theta_n = b_n/a_n. \]

Note. The coefficients $\alpha_n^2$ as a function of $n$ define what is called the power spectrum. The importance of $\alpha_n^2$ lies in its invariance under a shift in the phase $\theta_n$.

14.1.3 Assuming that $\int_{-\pi}^{\pi}[f(x)]^2\, dx$ is finite, show that

\[ \lim_{m \to \infty} a_m = 0, \qquad \lim_{m \to \infty} b_m = 0. \]

Hint. Integrate $[f(x) - s_n(x)]^2$, where $s_n(x)$ is the $n$th partial sum, and use Bessel's inequality (Section 9.4). For our finite interval the assumption that $f(x)$ is square integrable ($\int_{-\pi}^{\pi}|f(x)|^2\, dx$ is finite) implies that $\int_{-\pi}^{\pi}|f(x)|\, dx$ is also finite. The converse does not hold.

14.1.4 Apply the summation technique of this section to show that

\[ \sum_{n=1}^{\infty} \frac{\sin nx}{n} = \begin{cases} \frac{1}{2}(\pi - x), & 0 < x \le \pi, \\ -\frac{1}{2}(\pi + x), & -\pi \le x < 0 \end{cases} \]

(Fig. 14.5).

Figure 14.5 Reverse Sawtooth Wave

14.1.5 Sum the trigonometric series

\[ \sum_{n=0}^{\infty} \frac{\sin(2n + 1)x}{2n + 1}, \]

and show that it equals

\[ \begin{cases} \pi/4, & 0 < x < \pi, \\ -\pi/4, & -\pi < x < 0. \end{cases} \]


14.2 Advantages and Uses of Fourier Series


Periodic Functions


Related to the advantage of describing discontinuous functions is the usefulness of a Fourier series in representing a periodic function. If $f(x)$ has a period of $2\pi$, perhaps it is only natural that we expand it in a series of functions with period $2\pi$, $2\pi/2$, $2\pi/3$, .... This guarantees that if our periodic $f(x)$ is represented over one interval $[0, 2\pi]$ or $[-\pi, \pi]$, the representation holds for all finite $x$.

At this point, we may conveniently consider the properties of symmetry. Using the interval $[-\pi, \pi]$, $\sin x$ is odd and $\cos x$ is an even function of $x$. Hence, by Eqs. (14.2) and (14.3),⁴ if $f(x)$ is odd, all $a_n = 0$, and if $f(x)$ is even, all $b_n = 0$. In other words,

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos nx, \quad f(x) \text{ even}, \]    (14.22)

\[ f(x) = \sum_{n=1}^{\infty} b_n\sin nx, \quad f(x) \text{ odd}. \]    (14.23)

Frequently, these properties are helpful in expanding a given function.

⁴ With the range of integration $-\pi \le x \le \pi$.

We have noted that the Fourier series is periodic. This is important in 
considering whether Eq. (14.1) holds outside the initial interval. Suppose we 
are given only that 


\[
f(x) = x, \qquad 0 \le x \le \pi \tag{14.24}
\]

and are asked to represent $f(x)$ by a series expansion. Let us take three of the infinite number of possible expansions:

1. If we assume a Taylor expansion, we have

\[
f(x) = x, \tag{14.25}
\]

a one-term series. This (one-term) series is defined for all finite $x$.

2. Using the Fourier cosine series [Eq. (14.22)], we predict that

\[
f(x) = -x, \quad -\pi < x \le 0; \qquad f(x) = 2\pi - x, \quad \pi < x < 2\pi. \tag{14.26}
\]

3. Finally, from the Fourier sine series [Eq. (14.23)], we have

\[
f(x) = x, \quad -\pi < x \le 0; \qquad f(x) = x - 2\pi, \quad \pi < x < 2\pi. \tag{14.27}
\]

These three possibilities (Taylor series, Fourier cosine series, and Fourier sine series) are each perfectly valid in the original interval $[0, \pi]$. Outside, however, their behavior is strikingly different (compare Fig. 14.6). Which of the three, then, is correct? This question has no answer, unless we are given more information about $f(x)$. It may be any of the three or none of them. Our Fourier expansions are valid over the basic interval. Unless the function $f(x)$ is known to be periodic, with a period equal to our basic interval, or $(1/n)$th of our basic interval, there is no assurance whatever that the representation [Eq. (14.1)] will have any meaning outside the basic interval. Clearly, the interval of length $2\pi$, which defines the expansion, makes a real difference for a nonperiodic function because the Fourier series repeats the pattern of the basic interval in adjacent intervals. This also follows from Fig. 14.6.
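The different behavior outside the basic interval can be checked numerically. The sketch below is an illustration, not part of the original text: it assumes the standard cosine and sine series of $f(x) = x$ on $[0, \pi]$, and the function names and truncation lengths are arbitrary choices.

```python
import math

def cosine_series(x, terms=2000):
    # Cosine series of f(x) = x on [0, pi]:
    # pi/2 - (4/pi) * sum over odd n of cos(n x) / n^2
    total = sum(math.cos(n * x) / n**2 for n in range(1, 2 * terms, 2))
    return math.pi / 2 - 4 / math.pi * total

def sine_series(x, terms=20000):
    # Sine series of f(x) = x on [0, pi]:
    # 2 * sum over n of (-1)^(n+1) sin(n x) / n  (slowly convergent)
    total = sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))
    return 2 * total

# One point inside [0, pi] and two points outside it.
for x in (1.0, -1.0, 4.0):
    print(f"x = {x:5.2f}  Taylor = {x:8.4f}  "
          f"cosine = {cosine_series(x):8.4f}  sine = {sine_series(x):8.4f}")
```

All three agree at $x = 1$; at $x = -1$ and $x = 4$ the cosine series follows Eq. (14.26) and the sine series follows Eq. (14.27), while the Taylor series simply returns $x$.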

Figure 14.6 Comparison of Fourier Cosine Series, Fourier Sine Series, and Taylor Series

In addition to the advantages of representing discontinuous and periodic functions, there is a third very real advantage in using a Fourier series. Suppose that we are solving the equation of motion of an oscillating particle, subject to a periodic driving force. The Fourier expansion of the driving force then gives us the fundamental term and a series of harmonics. The (linear) ODE may be solved for each of these harmonics individually, a process that may be much easier than dealing with the original driving force. Then, as long as the ODE is linear, all the solutions may be added together to obtain the final solution.$^5$ This is more than just a clever mathematical trick.

• It corresponds to finding the response of the system to the fundamental frequency and to each of the harmonic frequencies, called Fourier analysis.
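As a hedged illustration of this superposition idea, the sketch below treats a damped oscillator $m\ddot{x} + c\dot{x} + kx = F(t)$ driven by an odd square wave of height $\pm h$, using the standard expansion $F(t) = (4h/\pi)\sum_{n\ \mathrm{odd}} \sin(n\omega t)/n$. The oscillator, its parameter values, and the function name are assumptions chosen for the example; they do not come from the text.

```python
import cmath
import math

def steady_state(t, m=1.0, c=0.4, k=9.0, h=1.0, omega=1.0, harmonics=50):
    """Return (x, force): steady-state displacement and truncated driving force at time t."""
    x = 0.0
    force = 0.0
    for n in range(1, 2 * harmonics, 2):       # odd harmonics only
        F_n = 4 * h / (math.pi * n)             # coefficient of sin(n omega t)
        w = n * omega
        # Response to F_n exp(i w t) is F_n exp(i w t) / (k - m w^2 + i c w);
        # the imaginary part is the response to F_n sin(w t).
        z = F_n * cmath.exp(1j * w * t) / (k - m * w**2 + 1j * c * w)
        x += z.imag
        force += F_n * math.sin(w * t)
    return x, force

for t in (0.0, 0.5, 1.0, 1.5):
    x, force = steady_state(t)
    print(f"t = {t:3.1f}  x = {x:8.4f}  F = {force:8.4f}")
```

With these assumed parameters the third harmonic sits at the undamped resonance ($k = m(3\omega)^2$), so it dominates the response even though its coefficient in the driving force is only one-third of the fundamental's.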

The following question is sometimes raised: Were the harmonics there all along, or were they created by our Fourier analysis? One answer compares the functional resolution into harmonics with the resolution of a vector into rectangular components. The components may have been present in the sense that they may be isolated and observed, but the resolution is certainly not unique. Hence, many authorities prefer to say that the harmonics were created by our choice of expansion. Other expansions in other sets of orthogonal functions would give different results. For further discussion, we refer to a series of notes and letters in the American Journal of Physics.$^6$


5 One of the nastier features of nonlinear differential equations is that this principle of superposition is not valid.

6 Robinson, B. L. (1953). Concerning frequencies resulting from distortion. Am. J. Phys. 21, 391; Van Name, F. W., Jr. (1954). Concerning frequencies resulting from distortion. Am. J. Phys. 22, 94.


Change of Interval


So far, attention has been restricted to an interval of length $2\pi$. This restriction may easily be relaxed. If $f(x)$ is periodic with a period $2L$, we may write

\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos\frac{n\pi x}{L} + b_n \sin\frac{n\pi x}{L} \right], \tag{14.28}
\]

with

\[
a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L}\, dt, \qquad n = 0, 1, 2, 3, \ldots, \tag{14.29}
\]

\[
b_n = \frac{1}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L}\, dt, \qquad n = 1, 2, 3, \ldots, \tag{14.30}
\]

replacing $x$ in Eq. (14.1) with $\pi x/L$ and $t$ in Eqs. (14.2) and (14.3) with $\pi t/L$. [For convenience, the interval in Eqs. (14.2) and (14.3) is shifted to $-\pi \le t \le \pi$.] The choice of the symmetric interval $[-L, L]$ is not essential. For $f(x)$ periodic with a period of $2L$, any interval $(x_0, x_0 + 2L)$ will do. The choice is a matter of convenience or literally personal preference.

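If our function $f(-x) = f(x)$ is even in $x$, then it has a pure cosine series because, by substituting $t \to -t$,

\[
\int_{-L}^{0} f(t) \sin\frac{n\pi t}{L}\, dt = -\int_{0}^{L} f(t) \sin\frac{n\pi t}{L}\, dt,
\]

so that all $b_n = 0$ and

\[
a_n = \frac{2}{L} \int_{0}^{L} f(t) \cos\frac{n\pi t}{L}\, dt.
\]

Similarly, if $f(-x) = -f(x)$ is odd in $x$, then $f(x)$ has a pure sine series with coefficients

\[
b_n = \frac{2}{L} \int_{0}^{L} f(t) \sin\frac{n\pi t}{L}\, dt.
\]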


EXAMPLE 14.2.1 


Asymmetric Square Wave. Let us derive the Fourier expansion for the square wave in Fig. 14.7:

\[
f(x) = h, \quad 0 < x < 1, \qquad f(x) = 0, \quad 1 < x < 2L,
\]

for $L \ge 1$ and $h > 0$. The expansion interval is $[0, 2L]$.

Figure 14.7 Asymmetric Square Wave

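The geometry of the square pulse implies that its Fourier coefficients are given by Eqs. (14.29) and (14.30) as

\[
a_n = \frac{h}{L} \int_0^1 \cos\frac{n\pi x}{L}\, dx = \frac{h}{n\pi} \sin\frac{n\pi x}{L} \Big|_0^1 = \frac{h}{n\pi} \sin\frac{n\pi}{L}, \qquad n = 0, 1, \ldots,
\]

\[
b_n = \frac{h}{L} \int_0^1 \sin\frac{n\pi x}{L}\, dx = -\frac{h}{n\pi} \cos\frac{n\pi x}{L} \Big|_0^1 = \frac{h}{n\pi} \left[ 1 - \cos\frac{n\pi}{L} \right], \qquad n = 1, 2, \ldots,
\]

where the $n = 0$ value is understood as the limit $a_0 = h/L$.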


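The resulting Fourier series

\[
f(x) = \frac{h}{2L} + \frac{h}{\pi} \sum_{n=1}^{\infty} \frac{1}{n} \left[ \sin\frac{n\pi}{L} \cos\frac{n\pi x}{L} + \left( 1 - \cos\frac{n\pi}{L} \right) \sin\frac{n\pi x}{L} \right]
\]

for $L = 1$ agrees with the square wave series of Example 14.1.3 because $\sin n\pi = 0$ and $\cos n\pi = (-1)^n$.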

Because $f$ is symmetric about $x = L + \tfrac{1}{2}$ (dashed vertical line in Fig. 14.7), we shift the origin of our coordinates from $x = 0$ to this new origin, calling $\xi = x - L - \tfrac{1}{2}$ the new variable. We expect a pure cosine series with coefficients

\[
A_n = \frac{h}{L} \int_{L-1/2}^{L+1/2} \cos\frac{n\pi \xi}{L}\, d\xi = \frac{h}{n\pi} \sin\frac{n\pi \xi}{L} \Big|_{L-1/2}^{L+1/2}
= \frac{h}{n\pi} \left[ \sin\frac{n\pi(2L+1)}{2L} - \sin\frac{n\pi(2L-1)}{2L} \right]
= (-1)^n \frac{2h}{n\pi} \sin\frac{n\pi}{2L},
\]

and $B_n = 0$. The Fourier series

\[
f(x) = \frac{h}{2L} + \frac{2h}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \sin\frac{n\pi}{2L} \cos\frac{n\pi(x - L - 1/2)}{L}
\]

can be seen to be equivalent to our first result using the addition formulas for trigonometric functions

\[
\cos\frac{n\pi(x - L - 1/2)}{L} = (-1)^n \left[ \cos\frac{n\pi}{2L} \cos\frac{n\pi x}{L} + \sin\frac{n\pi}{2L} \sin\frac{n\pi x}{L} \right].
\]

EXERCISES 


14.2.1 The boundary conditions [such as $\psi(0) = \psi(l) = 0$] may suggest solutions of the form $\sin(n\pi x/l)$ and eliminate the corresponding cosines.

(a) Verify that the boundary conditions used in the Sturm-Liouville theory are satisfied for the interval $(0, l)$. Note that this is only half the usual Fourier interval.

(b) Show that the set of functions $\varphi_n(x) = \sin(n\pi x/l)$, $n = 1, 2, 3, \ldots$ satisfies an orthogonality relation

\[
\int_0^l \varphi_m(x) \varphi_n(x)\, dx = \frac{l}{2} \delta_{mn}, \qquad n > 0.
\]


14.2.2 (a) Expand $f(x) = x$ in the interval $[0, 2L]$. Sketch the series you have found (right-hand side of Answer) over $[-2L, 2L]$.

\[
\text{ANS.} \quad x = L - \frac{2L}{\pi} \sum_{n=1}^{\infty} \frac{1}{n} \sin\left( \frac{n\pi x}{L} \right).
\]


(b) Expand f(x) = x as a sine series in the half interval (0, L ). Sketch 
the series you have found (right-hand side of Answer) over 
[-2L, 2L]. 


4 ^2, 

ANS. x=-J2 


7i 2n + 1 

n =0 


sin 


(2 n + l)7ra; 


14.2.3 In some problems it is convenient to approximate $\sin \pi x$ over the interval $[0, 1]$ by a parabola $ax(1 - x)$, where $a$ is a constant. To get a feeling for the accuracy of this approximation, expand $4x(1 - x)$ in a Fourier sine series:

\[
f(x) =
\begin{cases}
4x(1 - x), & 0 \le x \le 1 \\
4x(1 + x), & -1 \le x \le 0
\end{cases}
\;=\; \sum_{n=1}^{\infty} b_n \sin n\pi x.
\]

\[
\text{ANS.} \quad b_n = \frac{32}{\pi^3} \frac{1}{n^3}, \quad n \text{ odd}; \qquad b_n = 0, \quad n \text{ even}
\]

(Fig. 14.8).

14.2.4 Take $-\pi \le x \le \pi$ as the basic interval for the function $f(x) = x$ and repeat the arguments leading to Eqs. (14.25)-(14.27). Compare your results with those in Fig. 14.6 and plot them.

Figure 14.8 Parabolic Sine Wave



14.3 Complex Fourier Series 


One way of summing a trigonometric Fourier series is to transform it into exponential form and compare it with a Laurent series (see Section 6.5). If we expand $f(z)$ in a Laurent series [assuming $f(z)$ is analytic],

\[
f(z) = \sum_{n=-\infty}^{\infty} c_n z^n, \tag{14.31}
\]

then on the unit circle $z = e^{i\theta} = \cos\theta + i\sin\theta$ by Euler's identity so that

\[
f(z) = f(e^{i\theta}) = \sum_{n=-\infty}^{\infty} c_n e^{in\theta}. \tag{14.32}
\]

Expressing $\cos nx$ and $\sin nx$ in exponential form and setting $x = \theta$,

\[
\cos nx = \frac{1}{2}\left( e^{inx} + e^{-inx} \right), \qquad \sin nx = \frac{1}{2i}\left( e^{inx} - e^{-inx} \right),
\]

we may rewrite the general Fourier series [Eq. (14.1)] as

\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \frac{1}{2}\left( e^{inx} + e^{-inx} \right) + b_n \frac{1}{2i}\left( e^{inx} - e^{-inx} \right) \right] = \sum_{n=-\infty}^{\infty} c_n e^{inx}, \tag{14.33}
\]

in which

\[
c_n = \tfrac{1}{2}(a_n - i b_n), \qquad c_{-n} = \tfrac{1}{2}(a_n + i b_n), \quad n > 0, \qquad c_0 = \tfrac{1}{2} a_0. \tag{14.34}
\]

The Laurent expansion on the unit circle [Eq. (14.32)] has the same form as the complex Fourier series [Eq. (14.33)], which shows the equivalence between the two expansions.



















When the function $f(x)$ is known, the complex Fourier coefficients may be directly derived by projection from it:

\[
c_n = \frac{1}{2\pi} \int_0^{2\pi} e^{-inx} f(x)\, dx, \qquad -\infty < n < \infty,
\]

based on the orthogonality relation [Eq. (14.8)].
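The projection formula and the relations $c_n = \tfrac{1}{2}(a_n - ib_n)$, $c_{-n} = \tfrac{1}{2}(a_n + ib_n)$ can be illustrated on a trigonometric polynomial. The sketch below is an assumption-laden example: the test function $f(x) = 1 + 2\cos x + 3\sin 2x$ (so $a_0/2 = 1$, $a_1 = 2$, $b_2 = 3$) and the uniform Riemann sum are choices made here, not taken from the text.

```python
import cmath
import math

def c_n(f, n, samples=64):
    """Approximate c_n = (1/2pi) * integral_0^{2pi} exp(-i n x) f(x) dx by a uniform Riemann sum."""
    dx = 2 * math.pi / samples
    total = sum(cmath.exp(-1j * n * j * dx) * f(j * dx) for j in range(samples))
    return total * dx / (2 * math.pi)

def f(x):
    # Assumed test function with a_0/2 = 1, a_1 = 2, b_2 = 3.
    return 1 + 2 * math.cos(x) + 3 * math.sin(2 * x)

for n in range(-2, 3):
    print(n, c_n(f, n))
```

For a trigonometric polynomial the uniform sum reproduces the integral up to rounding error, so the printed values should match $c_0 = 1$, $c_{\pm 1} = 1$, $c_2 = -\tfrac{3}{2}i$, and $c_{-2} = +\tfrac{3}{2}i$.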



Consider a function $f(z)$ represented by a convergent power series

\[
f(z) = \sum_{n=0}^{\infty} c_n z^n = \sum_{n=0}^{\infty} c_n r^n e^{in\theta}. \tag{14.35}
\]

This is our Fourier exponential series [Eq. (14.32)]. Separating real and imaginary parts

\[
u(r, \theta) = \sum_{n=0}^{\infty} c_n r^n \cos n\theta, \qquad v(r, \theta) = \sum_{n=1}^{\infty} c_n r^n \sin n\theta, \tag{14.36}
\]

the Fourier cosine and sine series. Abel's theorem asserts that if $u(1, \theta)$ and $v(1, \theta)$ are convergent for a given $\theta$, then

\[
u(1, \theta) + i v(1, \theta) = \lim_{r \to 1} f(re^{i\theta}). \tag{14.37}
\]

An application of this theorem appears as Exercise 14.3.11, and it is used in the next example.


EXAMPLE 14.3.1 


Summation of a Complex Fourier Series. Consider the series $\sum_{n=1}^{\infty} (1/n) \cos nx$, $x \in (0, 2\pi)$. Since this series is only conditionally convergent (see Example 5.3.1) and diverges at $x = 0$ so that Dirichlet's conditions are violated, we take

\[
\sum_{n=1}^{\infty} \frac{\cos nx}{n} = \lim_{r \to 1} \sum_{n=1}^{\infty} \frac{r^n \cos nx}{n}, \tag{14.38}
\]

absolutely convergent for $|r| < 1$. Our procedure is again to try forming a power series by transforming the trigonometric functions into exponential form:

\[
\sum_{n=1}^{\infty} \frac{r^n \cos nx}{n} = \frac{1}{2} \sum_{n=1}^{\infty} \frac{r^n e^{inx}}{n} + \frac{1}{2} \sum_{n=1}^{\infty} \frac{r^n e^{-inx}}{n}. \tag{14.39}
\]

Now these power series may be identified as Maclaurin expansions of $-\ln(1 - z)$, $z = re^{ix}, re^{-ix}$ [Eq. (5.65)], and

\[
\sum_{n=1}^{\infty} \frac{r^n \cos nx}{n} = -\frac{1}{2} \left[ \ln(1 - re^{ix}) + \ln(1 - re^{-ix}) \right] = -\ln\left[ (1 + r^2) - 2r\cos x \right]^{1/2}. \tag{14.40}
\]
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Table 14.1 


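Fourier Series

1. $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n} \sin nx = \begin{cases} -\tfrac{1}{2}(\pi + x), & -\pi \le x < 0 \\ \tfrac{1}{2}(\pi - x), & 0 < x \le \pi \end{cases}$
   Reference: Exercise 14.1.4, Exercise 14.3.3

2. $\displaystyle \sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} \sin nx = \tfrac{1}{2} x, \quad -\pi < x < \pi$
   Reference: Example 14.1.1, Exercise 14.3.2

3. $\displaystyle \sum_{n=0}^{\infty} \frac{1}{2n+1} \sin(2n+1)x = \begin{cases} -\pi/4, & -\pi < x < 0 \\ +\pi/4, & 0 < x < \pi \end{cases}$
   Reference: Exercise 14.1.5, Eq. (14.52)

4. $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n} \cos nx = -\ln\left[ 2 \sin\left( \frac{|x|}{2} \right) \right], \quad -\pi \le x \le \pi,\ x \ne 0$
   Reference: Eq. (14.38)

5. $\displaystyle \sum_{n=1}^{\infty} (-1)^{n} \frac{1}{n} \cos nx = -\ln\left[ 2 \cos\left( \frac{x}{2} \right) \right], \quad -\pi < x < \pi$
   Reference: Exercise 14.3.11

6. $\displaystyle \sum_{n=0}^{\infty} \frac{1}{2n+1} \cos(2n+1)x = \frac{1}{2} \ln\left[ \cot\frac{|x|}{2} \right], \quad -\pi < x < \pi,\ x \ne 0$
   Reference: (Item 5 - Item 4)

7. $\displaystyle \sum_{n=1}^{\infty} (-1)^{n} \frac{\cos nx}{n^2} = \frac{x^2}{4} - \frac{\pi^2}{12}, \quad -\pi \le x \le \pi$
   Reference: Example 14.4.2

8. $\displaystyle \sum_{n=0}^{\infty} \frac{\cos(2n+1)x}{(2n+1)^2} = \frac{\pi}{4} \left( \frac{\pi}{2} - |x| \right), \quad -\pi \le x \le \pi$
   Reference: Exercise 14.3.4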


Letting $r = 1$ based on Abel's theorem, we obtain item 4 of Table 14.1

\[
\sum_{n=1}^{\infty} \frac{\cos nx}{n} = -\ln(2 - 2\cos x)^{1/2} = -\ln\left( 2 \sin\frac{x}{2} \right), \qquad x \in (0, 2\pi).^7 \tag{14.41}
\]

Both sides of this expression diverge as $x \to 0$ and $2\pi$.
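A quick numerical check of Eqs. (14.40) and (14.41) is sketched below. The sample point $x = 1$, the values of $r$, and the truncation length are assumptions for illustration only.

```python
import math

def abel_sum(x, r, terms=20000):
    # Truncated sum of r^n cos(n x) / n, the left-hand side of Eq. (14.40).
    return sum(r**n * math.cos(n * x) / n for n in range(1, terms + 1))

def closed_form(x, r):
    # Right-hand side of Eq. (14.40): -(1/2) ln(1 + r^2 - 2 r cos x).
    return -0.5 * math.log(1 + r * r - 2 * r * math.cos(x))

x = 1.0
for r in (0.5, 0.9, 0.99, 0.999):
    print(r, abel_sum(x, r), closed_form(x, r))
print("r -> 1 limit, Eq. (14.41):", -math.log(2 * math.sin(x / 2)))
```

As $r$ approaches 1 the closed form tends to $-\ln(2\sin(x/2))$, while the convergence of the truncated sum becomes progressively slower, which is why the Abel limit is the convenient route here.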


EXERCISES 


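14.3.1 Develop the Fourier series representation of

\[
f(t) =
\begin{cases}
0, & -\pi \le \omega t \le 0, \\
\sin\omega t, & 0 \le \omega t \le \pi.
\end{cases}
\]

This is the output of a simple half-wave rectifier. It is also an approximation of the solar thermal effect that produces "tides" in the atmosphere.

\[
\text{ANS.} \quad f(t) = \frac{1}{\pi} + \frac{1}{2} \sin\omega t - \frac{2}{\pi} \sum_{n=2,4,6,\ldots}^{\infty} \frac{\cos n\omega t}{n^2 - 1}.
\]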

7 The limits may be shifted to $[-\pi, \pi]$ (and $x \ne 0$) using $|x|$ on the right-hand side.
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14.3.2 A sawtooth wave is given by 

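\[
f(x) = x, \qquad -\pi < x < \pi.
\]

Show that

\[
f(x) = 2 \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin nx.
\]

14.3.3 A different sawtooth wave is described by

\[
f(x) =
\begin{cases}
-\tfrac{1}{2}(\pi + x), & -\pi \le x < 0 \\
+\tfrac{1}{2}(\pi - x), & 0 < x \le \pi.
\end{cases}
\]

Show that $f(x) = \sum_{n=1}^{\infty} (\sin nx / n)$.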

14.3.4 A triangular wave (Fig. 14.9) is represented by

\[
f(x) =
\begin{cases}
x, & 0 < x < \pi \\
-x, & -\pi < x < 0.
\end{cases}
\]

Represent $f(x)$ by a Fourier series.

\[
\text{ANS.} \quad f(x) = \frac{\pi}{2} - \frac{4}{\pi} \sum_{n=1,3,5,\ldots} \frac{\cos nx}{n^2}.
\]

Figure 14.9 Triangular Wave

14.3.5 Expand

\[
f(x) =
\begin{cases}
1, & x^2 < x_0^2 \\
0, & x^2 > x_0^2
\end{cases}
\]

in the interval $[-\pi, \pi]$.

Note. This variable-width square wave is important in electronic music.
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Figure 14.10 

Cross Section of 
Split Tube 



14.3.6 A metal cylindrical tube of radius a is split lengthwise into two non¬ 
touching halves. The top half is maintained at a potential +V and the 
bottom half at a potential — V (Fig. 14.10). Separate the variables in 
Laplace’s equation and solve for the electrostatic potential for r < a. 
Observe the resemblance between your solution for r = a and the 
Fourier series for a square wave. 

14.3.7 A metal cylinder is placed in a (previously) uniform electric field, $E_0$, with the axis of the cylinder perpendicular to that of the original field.

(a) Find the perturbed electrostatic potential. 

(b) Find the induced surface charge on the cylinder as a function of 
angular position. 


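14.3.8 Expand $\delta(x - t)$ in a Fourier series.

\[
\text{ANS.} \quad \delta(x - t) = \frac{1}{2\pi} + \frac{1}{\pi} \sum_{n=1}^{\infty} (\cos nx \cos nt + \sin nx \sin nt)
= \frac{1}{2\pi} + \frac{1}{\pi} \sum_{n=1}^{\infty} \cos n(x - t).
\]

14.3.9 Verify that

\[
\delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}
\]

is a Dirac delta function by showing that it satisfies the definition of a Dirac delta function:

\[
\int_{-\pi}^{\pi} f(\varphi_2) \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}\, d\varphi_2 = f(\varphi_1).
\]

Hint. Represent $f(\varphi_2)$ by an exponential Fourier series.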
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Note. The continuum analog of this expression is developed in Sec¬ 
tion 15.2. 


14.3.10 (a) Find the Fourier series representation of

\[
f(x) =
\begin{cases}
0, & -\pi < x \le 0 \\
x, & 0 \le x < \pi.
\end{cases}
\]

(b) From the Fourier expansion show that

\[
\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.
\]

14.3.11 Let $f(z) = \ln(1 + z) = \sum_{n=1}^{\infty} (-1)^{n+1} z^n / n$. (This series converges to $\ln(1 + z)$ for $|z| \le 1$, except at the point $z = -1$.)

(a) From the real parts (item 5 of Table 14.1) show that

\[
\ln\left( 2 \cos\frac{\theta}{2} \right) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\cos n\theta}{n}, \qquad -\pi < \theta < \pi.
\]

(b) Using a change of variable, transform part (a) into

\[
\ln\left( 2 \sin\frac{\theta}{2} \right) = -\sum_{n=1}^{\infty} \frac{\cos n\theta}{n}, \qquad 0 < \theta < 2\pi.
\]


14.3.12 A symmetric triangular pulse of adjustable height and width is described by

\[
f(x) =
\begin{cases}
a(1 - |x|/b), & 0 \le |x| \le b \\
0, & b \le |x| \le \pi.
\end{cases}
\]

(a) Show that the Fourier coefficients are

\[
a_0 = \frac{ab}{\pi}, \qquad a_n = \frac{2ab}{\pi} (1 - \cos nb)/(nb)^2.
\]

Sum the finite Fourier series through $n = 10$ and through $n = 100$ for $x/\pi = 0(1/9)1$ (that is, from 0 to 1 in steps of 1/9). Take $a = 1$ and $b = \pi/2$.

(b) Call a Fourier analysis subroutine (if available) to calculate the Fourier coefficients of $f(x)$, $a_0$ through $a_{10}$.

14.3.13 (a) Using a Fourier analysis subroutine, calculate the Fourier cosine coefficients $a_0$ through $a_{10}$ of

\[
f(x) = [1 - (x/\pi)^2]^{1/2}, \qquad x \in [-\pi, \pi].
\]

(b) Spot check by calculating some of the preceding coefficients by direct numerical quadrature.

Check values. $a_0 = 0.785$, $a_2 = 0.284$.

14.3.14 A function $f(x)$ is expanded in an exponential Fourier series

\[
f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}.
\]

If $f(x)$ is real, $f(x) = f^*(x)$, what restriction is imposed on the coefficients $c_n$?


14.4 Properties of Fourier Series


Convergence


Note that our Fourier series should not be expected to be uniformly convergent if it represents a discontinuous function. A uniformly convergent series of continuous functions ($\sin nx$, $\cos nx$) always yields a continuous function $f(x)$ (compare Section 5.5). If, however, (i) $f(x)$ is continuous, $-\pi \le x \le \pi$, (ii) $f(-\pi) = f(+\pi)$, and (iii) $f'(x)$ is sectionally continuous, the Fourier series for $f(x)$ will converge uniformly. These restrictions do not demand that $f(x)$ be periodic, but they will be satisfied by continuous, differentiable, periodic functions (period of $2\pi$). For a proof of uniform convergence we refer to the literature.$^8$ With or without a discontinuity in $f(x)$, the Fourier series will yield convergence in the mean (Section 9.4).

If a function is square integrable, then Sturm-Liouville theory implies the validity of the Bessel inequality [Eq. (9.73)]

\[
\frac{a_0^2}{2} + \sum_{n=1}^{\infty} \left( a_n^2 + b_n^2 \right) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f^2(x)\, dx \tag{14.42}
\]

so that at least the sum of squares of its Fourier coefficients converges.


EXAMPLE 14.4.1 


Absolutely Convergent Fourier Series. If the periodic function $f(x)$ has a bounded second derivative $f''(x)$, then its Fourier series converges absolutely. Note that the converse is not valid, as the full-wave rectifier (Example 14.1.2) demonstrates.

To show this, let us integrate by parts the Fourier coefficients

\[
a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\, dx = \frac{f(x) \sin nx}{n\pi} \Big|_{-\pi}^{\pi} - \frac{1}{n\pi} \int_{-\pi}^{\pi} f'(x) \sin nx\, dx,
\]

where the integrated term vanishes. Because the first derivative $f'(x)$ is bounded a fortiori, $|a_n| = O(1/n)$, and similarly $b_n \to 0$ for $n \to \infty$, at least as fast as $1/n$, which is not sufficient for absolute convergence. However, another integration by parts can be done, yielding

\[
a_n = \frac{f'(x) \cos nx}{n^2 \pi} \Big|_{-\pi}^{\pi} - \frac{1}{n^2 \pi} \int_{-\pi}^{\pi} f''(x) \cos nx\, dx, \tag{14.43}
\]


where the integrated terms cancel each other because $f'$ is continuous and periodic; that is, $f'(\pi) = f'(-\pi)$. Since $|f''(x)| \le M$, we find the upper bound

\[
|a_n| \le \frac{M}{n^2 \pi} \int_{-\pi}^{\pi} dx = \frac{2M}{n^2}, \tag{14.44}
\]

which implies absolute convergence (by the integral test of Chapter 5), and the same applies to $b_n$. ■

If the $|a_n|, |b_n| \le n^{-\alpha}$ with $0 < \alpha \le 1$, then we have at least conditional convergence, and the function $f(x)$ may have discontinuities. If $\alpha > 1$, then there is absolute convergence by the integral test of Chapter 5.

8 See, for instance, Churchill, R. V. (1993). Fourier Series and Boundary Value Problems, 5th ed., Section 38. McGraw-Hill, New York.


Integration 


Term-by-term integration of the series

\[
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx \tag{14.45}
\]

yields

\[
\int_{x_0}^{x} f(x)\, dx = \frac{a_0 x}{2} \Big|_{x_0}^{x} + \sum_{n=1}^{\infty} \frac{a_n}{n} \sin nx \Big|_{x_0}^{x} - \sum_{n=1}^{\infty} \frac{b_n}{n} \cos nx \Big|_{x_0}^{x}. \tag{14.46}
\]

Clearly, the effect of integration is to place an additional power of $n$ in the denominator of each coefficient. This results in more rapid convergence than before. Consequently, a convergent Fourier series may always be integrated term by term, with the resulting series converging uniformly to the integral of the original function. Indeed, term-by-term integration may be valid even if the original series [Eq. (14.45)] is not convergent. The function $f(x)$ need only be integrable. A discussion will be found in Jeffreys and Jeffreys, Section 14.06 (see Additional Reading).

Strictly speaking, Eq. (14.46) may not be a Fourier series; that is, if $a_0 \ne 0$, there will be a term $\tfrac{1}{2} a_0 x$. However,

\[
\int_{x_0}^{x} f(x)\, dx - \frac{1}{2} a_0 x \tag{14.47}
\]

will still be a Fourier series.


EXAMPLE 14.4.2 


Integration of Fourier Series. Consider the sawtooth series for

\[
f(x) = x, \qquad -\pi < x < \pi. \tag{14.48}
\]

Comparing with Exercise 14.1.1, the Fourier series is

\[
x = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin nx}{n}, \qquad -\pi < x < \pi,
\]

which converges conditionally, but not absolutely, because the harmonic series diverges. Now we integrate it and obtain item 7 of Table 14.1

\[
\int_0^x 2 \sum_{n=1}^{\infty} (-1)^{n-1} \frac{\sin nx}{n}\, dx = 2 \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \int_0^x \sin nx\, dx
= 2 \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2} (1 - \cos nx)
= \frac{\pi^2}{6} + 2 \sum_{n=1}^{\infty} (-1)^n \frac{\cos nx}{n^2} = \frac{x^2}{2},
\]

using Exercise 14.4.1. Because $|\cos nx| \le 1$ is bounded and the series $\sum_n 1/n^2$ converges, our integrated Fourier series converges absolutely to the limit $\int_0^x x\, dx = x^2/2$. ■
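The integrated series of Example 14.4.2 converges quickly enough to be checked with a short partial sum. The sketch below is illustrative only; the sample points and the number of terms are arbitrary assumptions.

```python
import math

def integrated_series(x, terms=5000):
    # pi^2/6 + 2 * sum over n of (-1)^n cos(n x) / n^2, which should equal x^2 / 2
    s = sum((-1) ** n * math.cos(n * x) / n**2 for n in range(1, terms + 1))
    return math.pi**2 / 6 + 2 * s

for x in (0.0, 0.5, 2.0, math.pi):
    print(f"x = {x:6.4f}  series = {integrated_series(x):9.6f}  x^2/2 = {x * x / 2:9.6f}")
```

The agreement is slowest at $x = \pm\pi$, where every term of the series has the same sign and the remainder falls off only like $1/N$.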


Differentiation 


The situation regarding differentiation is quite different from that of integra¬ 
tion. Here the word is caution. 


EXAMPLE 14.4.3 


Differentiation of Fourier Series. Consider again the series of Example 14.4.2. Differentiating term by term, we obtain

\[
1 = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \cos nx, \tag{14.49}
\]

which is not convergent for any value of $x$. Warning: Check the convergence of your derivative.

For a triangular wave (Exercise 14.3.4), in which the convergence is more rapid (and uniform),

\[
f(x) = \frac{\pi}{2} - \frac{4}{\pi} \sum_{n=1,\,\mathrm{odd}}^{\infty} \frac{\cos nx}{n^2}. \tag{14.50}
\]

Differentiating term by term,

\[
f'(x) = \frac{4}{\pi} \sum_{n=1,\,\mathrm{odd}}^{\infty} \frac{\sin nx}{n}, \tag{14.51}
\]

which is the Fourier expansion of a square wave

\[
f'(x) =
\begin{cases}
1, & 0 < x < \pi, \\
-1, & -\pi < x < 0.
\end{cases} \tag{14.52}
\]

Inspection of Fig. 14.4 verifies that this is indeed the derivative of our triangular wave. ■
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• As the inverse of integration, the operation of differentiation has placed 
an additional factor n in the numerator of each term. This reduces the 
rate of convergence and may, as in the first case mentioned, render the 
differentiated series divergent. 

• In general, term-by-term differentiation is permissible under the same con¬ 
ditions listed for uniform convergence in Chapter 5. 
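The contrast between the two differentiated series, Eqs. (14.49) and (14.51), can be seen in their partial sums. This is a minimal sketch with assumed sample points and truncation lengths; the function names are hypothetical.

```python
import math

def sawtooth_derivative_partial(x, N):
    # Partial sum of 2 * sum_{n=1}^{N} (-1)^(n+1) cos(n x), Eq. (14.49).
    return 2 * sum((-1) ** (n + 1) * math.cos(n * x) for n in range(1, N + 1))

def triangle_derivative_partial(x, N):
    # Partial sum of (4/pi) * sum over odd n <= N of sin(n x) / n, Eq. (14.51).
    return 4 / math.pi * sum(math.sin(n * x) / n for n in range(1, N + 1, 2))

for N in (10, 11, 100, 101):
    print(N, sawtooth_derivative_partial(0.0, N), triangle_derivative_partial(1.0, N))
```

At $x = 0$ the partial sums of Eq. (14.49) jump between 0 and 2 and never settle, whereas those of Eq. (14.51) approach the square-wave value 1 for $0 < x < \pi$.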

From the expansion of x and expansions of other powers of x, numerous 
other infinite series can be evaluated. A few are included in the subsequent 
exercises. 


EXERCISES 


14.4.1 Show that integration of the Fourier expansion of $f(x) = x$, $-\pi < x < \pi$, leads to

\[
\frac{\pi^2}{12} = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \cdots.
\]


14.4.2 Parseval's identity.

(a) Assuming that the Fourier expansion of $f(x)$ is uniformly convergent, show that

\[
\frac{1}{\pi} \int_{-\pi}^{\pi} [f(x)]^2\, dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty} \left( a_n^2 + b_n^2 \right).
\]

This is Parseval's identity. It is actually a special case of the completeness relation [Eq. (9.73)].

(b) Given

\[
x^2 = \frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{(-1)^n \cos nx}{n^2}, \qquad -\pi \le x \le \pi,
\]

apply Parseval's identity to obtain $\zeta(4)$ in closed form.

(c) The condition of uniform convergence is not necessary. Show this by applying the Parseval identity to the square wave

\[
f(x) =
\begin{cases}
-1, & -\pi < x < 0 \\
1, & 0 < x < \pi
\end{cases}
\;=\; \frac{4}{\pi} \sum_{n=1}^{\infty} \frac{\sin(2n - 1)x}{2n - 1}.
\]


14.4.3 Show that integrating the Fourier expansion of the Dirac delta function (Exercise 14.3.8) leads to the Fourier representation of the square wave [Eq. (14.21)], with $h = 1$.

Note. Integrating the constant term $(1/2\pi)$ leads to a term $x/2\pi$. What are you going to do with this?
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14.4.4 Integrate the Fourier expansion of the unit step function 


\[
f(x) =
\begin{cases}
0, & -\pi < x < 0 \\
x, & 0 < x < \pi.
\end{cases}
\]

Show that your integrated series agrees with Exercise 14.3.12.


14.4.5 In the interval $[-\pi, \pi]$,

\[
\delta_n(x) =
\begin{cases}
n, & \text{for } |x| < 1/(2n), \\
0, & \text{for } |x| > 1/(2n)
\end{cases}
\]

(Fig. 14.11).

(a) Expand $\delta_n(x)$ as a Fourier cosine series.

(b) Show that your Fourier series agrees with a Fourier expansion of $\delta(x)$ in the limit as $n \to \infty$.


Figure 14.11 
Rectangular Pulse 



14.4.6 Confirm the delta function nature of your Fourier series of Exercise 14.4.5 by showing that for any $f(x)$ that is finite in the interval $[-\pi, \pi]$ and continuous at $x = 0$,

\[
\int_{-\pi}^{\pi} f(x) [\text{Fourier expansion of } \delta_\infty(x)]\, dx = f(0).
\]

14.4.7 Find the charge distribution over the interior surfaces of the semicircles of Exercise 14.3.6.

Note. You obtain a divergent series and this Fourier approach fails. Using conformal mapping techniques, we may show the charge density to be proportional to $\csc\theta$. Does $\csc\theta$ have a Fourier expansion?


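14.4.8 Given

\[
\varphi_1(x) = \sum_{n=1}^{\infty} \frac{\sin nx}{n} =
\begin{cases}
-(\pi + x)/2, & -\pi \le x < 0 \\
(\pi - x)/2, & 0 < x \le \pi,
\end{cases}
\]

show by integrating that

\[
\varphi_2(x) = \sum_{n=1}^{\infty} \frac{\cos nx}{n^2} =
\begin{cases}
(\pi + x)^2/4 - \pi^2/12, & -\pi \le x \le 0 \\
(\pi - x)^2/4 - \pi^2/12, & 0 \le x \le \pi.
\end{cases}
\]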
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Additional Reading 


Carslaw, H. S. (1921). Introduction to the Theory of Fourier’s Series and 
Integrals, 2nd ed. Macmillan, London. Paperback, 3rd ed., Dover, New 
York (1952). This is a detailed and classic work. 

Hamming, R. W. (1973). Numerical Methods for Scientists and Engineers, 2nd 
ed. McGraw-Hill, New York. Reprinted, Dover, New York (1987). 

Jeffreys, H., and Jeffreys, B. S. (1972). Methods of Mathematical Physics, 3rd 
ed. Cambridge Univ. Press, Cambridge, UK. 

Kufner, A., and Kadlec, J. (1971). Fourier Series. Iliffe, London. This book is a 
clear account of Fourier series in the context of Hilbert space. 

Lanczos, C. (1956). Applied Analysis. Prentice-Hall, Englewood Cliffs, NJ. 
Reprinted, Dover, New York (1988). The book gives a well-written presen¬ 
tation of the Lanczos convergence technique (which suppresses the Gibbs 
phenomenon oscillations). This and several other topics are presented 
from the point of view of a mathematician who wants useful numerical 
results and not just abstract existence theorems. 

Oberhettinger, F. (1973). Fourier Expansions, A Collection of Formulas. Aca¬ 
demic Press, New York. 

Zygmund, A. (1988). Trigonometric Series. Cambridge Univ. Press, Cambridge, 
UK. The volume contains an extremely complete exposition, including 
relatively recent results in the realm of pure mathematics. 





Chapter 15 



Integral Transforms 


15.1 Introduction and Definitions


Frequently in physics we encounter pairs of functions related by an integral of the form

\[
F(\alpha) = \int_a^b f(t) K(\alpha, t)\, dt. \tag{15.1}
\]

The function $F(\alpha)$ is called the (integral) transform of $f(t)$ by the kernel $K(\alpha, t)$. The operation may also be described as mapping a function $f(t)$ in $t$-space into another function $F(\alpha)$ in $\alpha$-space. This interpretation takes on physical significance in the time-frequency relation of Fourier transforms, such as Example 15.4.4.


Linearity 


These integral transforms are linear operators; that is,

\[
\int_a^b [c_1 f_1(t) + c_2 f_2(t)] K(\alpha, t)\, dt = c_1 \int_a^b f_1(t) K(\alpha, t)\, dt + c_2 \int_a^b f_2(t) K(\alpha, t)\, dt = c_1 F_1(\alpha) + c_2 F_2(\alpha), \tag{15.2}
\]

\[
\int_a^b c f(t) K(\alpha, t)\, dt = c \int_a^b f(t) K(\alpha, t)\, dt, \tag{15.3}
\]

where $c_1$, $c_2$, $c$ are constants, and $f_1(t)$, $f_2(t)$ are functions for which the integral transform is well defined.




Figure 15.1 

Schematic Integral 
Transforms 



Representing our linear integral transform by the operator $\mathcal{L}$, we obtain

\[
F = \mathcal{L} f. \tag{15.4}
\]

We expect an inverse operator $\mathcal{L}^{-1}$ exists, such that$^1$

\[
f = \mathcal{L}^{-1} F. \tag{15.5}
\]

For our three Fourier transforms $\mathcal{L}^{-1}$ is given in Section 15.4. In general, the evaluation of the inverse transform is the main problem in using integral transforms. The inverse Laplace transform is discussed in Section 15.12.

Integral transforms have many special physical applications and interpre¬ 
tations that are noted in the remainder of this chapter. The most common 
application is outlined in Fig. 15.1. Perhaps an original problem can be solved 
only with difficulty, if at all, in the original coordinates (space). It often hap¬ 
pens that the transform of the problem can be solved relatively easily. Then, 
the inverse transform returns the solution from the transform coordinates to 
the original system. Examples 15.5.2 and 15.5.4 illustrate this technique. 


15.2 Fourier Transform 


One of the most useful of the infinite number of possible transforms is the Fourier transform, given by

\[
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{i\omega t}\, dt. \tag{15.6}
\]

1 Expectation is not proof, and here proof of existence is complicated because we are actually in an infinite-dimensional space. We shall prove existence in the special cases of interest by actual construction.
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Two modifications of this form, developed in Section 15.4, are the Fourier 
cosine and Fourier sine transforms: 


\[
F_c(\omega) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} f(t) \cos\omega t\, dt, \tag{15.7}
\]

\[
F_s(\omega) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} f(t) \sin\omega t\, dt. \tag{15.8}
\]

All these integrals exist if $\int_{-\infty}^{\infty} |f(t)|\, dt < \infty$, a condition denoted $f \in L(-\infty, \infty)$ in the mathematical literature and meaning that the function $f$ belongs to the space of absolutely integrable functions. Moreover, then Riemann's lemma holds

\[
\int_{-\infty}^{\infty} f(t) \cos\omega t\, dt \to 0, \qquad \int_{-\infty}^{\infty} f(t) \sin\omega t\, dt \to 0, \qquad \text{as } \omega \to \infty.
\]

The Fourier transform is based on the kernel $e^{i\omega t}$ and its real and imaginary parts taken separately, $\cos\omega t$ and $\sin\omega t$. Because these kernels are the functions used to describe waves, due to their periodicity, Fourier transforms appear frequently in studies of waves and the extraction of information from waves, particularly when phase information is involved. The output of a stellar interferometer, for instance, involves a Fourier transform of the brightness across a stellar disk. The electron charge distribution in an atom may be obtained from a Fourier transform of the amplitude of scattered X-rays. In quantum mechanics the physical origin of the Fourier relations of Section 15.7 is the wave nature of matter and our description of matter in terms of waves.

If we differentiate the Fourier transform

\[
\frac{dF(\omega)}{d\omega} = \frac{i}{\sqrt{2\pi}} \int_{-\infty}^{\infty} t f(t) e^{i\omega t}\, dt,
\]

we see that the original function $f(t)$ is multiplied by $it$. This is one way of generating new Fourier transforms.

If we differentiate a cosine transform with respect to $\omega$, we are led to a sine transform and vice versa. Many examples are given by Titchmarsh (see Additional Reading).


EXAMPLE 15.2.1 


Square Pulse. Let us find the Fourier transform for the shape in Fig. 15.2:

\[
f(t) =
\begin{cases}
1, & |t| < 1, \\
0, & |t| > 1,
\end{cases}
\]

which is an even function of $t$. This is the single slit diffraction problem of physical optics. The slit is described by $f(t)$. The diffraction pattern amplitude is given by the Fourier transform $F(\omega)$. Starting from Eq. (15.6),

\[
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{i\omega t}\, dt = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{i\omega t}\, dt = \frac{1}{\sqrt{2\pi}} \frac{e^{i\omega t}}{i\omega} \Big|_{-1}^{1} = \sqrt{\frac{2}{\pi}}\, \frac{\sin\omega}{\omega},
\]

which is an even function of $\omega$.

Figure 15.2 Square Pulse
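The closed form of Example 15.2.1 can be confirmed by integrating Eq. (15.6) numerically over the slit. The sketch below assumes a simple midpoint rule; the function names and sample frequencies are illustrative choices.

```python
import cmath
import math

def pulse_transform(omega, steps=4000):
    """Midpoint-rule estimate of (1/sqrt(2 pi)) * integral_{-1}^{1} exp(i omega t) dt."""
    dt = 2.0 / steps
    total = 0j
    for j in range(steps):
        t = -1.0 + (j + 0.5) * dt
        total += cmath.exp(1j * omega * t) * dt
    return total / math.sqrt(2 * math.pi)

def closed_form(omega):
    # sqrt(2/pi) * sin(omega) / omega, the result of Example 15.2.1
    return math.sqrt(2 / math.pi) * math.sin(omega) / omega

for omega in (0.5, 2.0, math.pi, 5.0):
    print(omega, pulse_transform(omega), closed_form(omega))
```

The imaginary part of the numerical result vanishes to rounding error because $f(t)$ is even, and the real part passes through zero at $\omega = \pi$, the first minimum of the single-slit pattern.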


EXAMPLE 15.2.2 


Fourier Transform of Gaussian. The Fourier transform of a Gaussian,

\[
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-a^2 t^2} e^{i\omega t}\, dt, \tag{15.9}
\]

can be done analytically by completing the square in the exponent,

\[
-a^2 t^2 + i\omega t = -a^2 \left( t - \frac{i\omega}{2a^2} \right)^2 - \frac{\omega^2}{4a^2},
\]

which we check by evaluating the square. Substituting this identity we obtain

\[
F(\omega) = \frac{1}{\sqrt{2\pi}}\, e^{-\omega^2/4a^2} \int_{-\infty}^{\infty} e^{-a^2 t^2}\, dt
\]

upon shifting the integration variable $t \to t + \frac{i\omega}{2a^2}$. This is justified by an application of Cauchy's theorem to the rectangle with vertices $-T$, $T$, $T + \frac{i\omega}{2a^2}$, $-T + \frac{i\omega}{2a^2}$ for $T \to \infty$, noting that the integrand has no singularities in this region and the integrals over the sides from $\pm T$ to $\pm T + \frac{i\omega}{2a^2}$ become negligible for $T \to \infty$. Finally, we rescale the integration variable as $\xi = at$ in the integral

\[
\int_{-\infty}^{\infty} e^{-a^2 t^2}\, dt = \frac{1}{a} \int_{-\infty}^{\infty} e^{-\xi^2}\, d\xi = \frac{\sqrt{\pi}}{a}.
\]

Substituting these results we find

\[
F(\omega) = \frac{1}{a\sqrt{2}} \exp\left( -\frac{\omega^2}{4a^2} \right), \tag{15.10}
\]

again a Gaussian, but in $\omega$-space. The smaller $a$ is (i.e., the wider the original Gaussian $e^{-a^2 t^2}$ is), the narrower is its Fourier transform $\sim e^{-\omega^2/4a^2}$. Differentiating $F(\omega)$, the Fourier transform of $t e^{-a^2 t^2}$ is $\sim i\omega e^{-\omega^2/4a^2}$, etc. ■
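Equation (15.10) can be verified by direct quadrature of Eq. (15.9). The sketch below is an illustration under assumptions: the integral is truncated at $|t| = 10/a$, where the Gaussian is negligible, and the midpoint rule and the sample values of $a$ and $\omega$ are arbitrary choices.

```python
import cmath
import math

def gaussian_transform(omega, a, steps=20000):
    """Midpoint-rule estimate of Eq. (15.9), truncated to |t| <= 10/a."""
    T = 10.0 / a
    dt = 2 * T / steps
    total = 0j
    for j in range(steps):
        t = -T + (j + 0.5) * dt
        total += math.exp(-(a * t) ** 2) * cmath.exp(1j * omega * t) * dt
    return total / math.sqrt(2 * math.pi)

def closed_form(omega, a):
    # Eq. (15.10): exp(-omega^2 / (4 a^2)) / (a sqrt(2))
    return math.exp(-omega**2 / (4 * a**2)) / (a * math.sqrt(2))

for a in (0.5, 1.0, 2.0):
    for omega in (0.0, 1.0, 3.0):
        print(a, omega, gaussian_transform(omega, a).real, closed_form(omega, a))
```

Comparing rows with different $a$ shows the reciprocal width relation directly: the transform for $a = 0.5$ has already fallen to a small value at $\omega = 3$, while the one for $a = 2$ has barely decreased.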


Laplace Transform 


The equally important Laplace transform is related to a Fourier transform by replacing the frequency $\omega$ with an imaginary variable and changing the integration interval, that is, $\exp(i\omega x) \to \exp(-sx)$, which will be developed in Sections 15.8-15.12. The Laplace transform

\[
F(s) = \int_0^{\infty} f(t) e^{-st}\, dt \tag{15.11}
\]

has the kernel $e^{-st}$. Clearly, the possible types of integral transforms are unlimited. The Laplace transform has been useful in mathematical analysis as well as in physics and engineering applications. The Laplace and Fourier transforms are by far the most used. For Mellin, Hankel, Bessel, and other transforms, see Additional Reading.

If the integrand of a Fourier integral is analytic, then the integral may be 
evaluated using the residue theorem according to Eq. (7.30). See Example 7.2.3 
and Exercises 7.2.4 and 7.2.15. The same method applies to Laplace transforms. 


EXAMPLE 15.2.3 


Euler Integral as Laplace Transform If we generalize the Euler integral [Eq. (10.5)] to

\[
\int_0^{\infty} e^{-st}\, t^z\, dt = \frac{1}{s^{z+1}} \int_0^{\infty} e^{-st} (st)^z\, d(st) = \frac{\Gamma(z+1)}{s^{z+1}},
\]

where $z$ is a complex parameter with $\Re(z) > -1$, the Laplace transform of the power $t^z$ is the inverse power $s^{-z-1}$ up to the normalization factor $\Gamma(z+1)$. ■
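A short numerical illustration of this Laplace transform follows. It is a sketch under stated assumptions: $z$ is restricted to real values with $z \ge 1$ (so that the trapezoid rule behaves well near $t = 0$), $s > 0$, and the function names are hypothetical.

```python
import math


def laplace_of_power(z, s, n=20000):
    """Trapezoid estimate of integral_0^inf exp(-s t) t^z dt for real z >= 1, s > 0."""
    t_max = 60.0 / s              # exp(-s t) <= exp(-60) beyond this cutoff
    h = t_max / n
    total = 0.0                   # the integrand vanishes at t = 0 for z > 0
    for k in range(1, n + 1):
        t = k * h
        weight = 0.5 if k == n else 1.0
        total += weight * math.exp(-s * t) * t ** z
    return total * h


def closed_form(z, s):
    """Gamma(z + 1) / s^(z + 1), the right-hand side of the generalized Euler integral."""
    return math.gamma(z + 1.0) / s ** (z + 1.0)


if __name__ == "__main__":
    for z, s in ((2.5, 2.0), (3.0, 1.0)):
        print(z, s, laplace_of_power(z, s), closed_form(z, s))
```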


Laplace transforms will be treated in detail starting in Section 15.8. 


EXERCISES 

15.2.1 (a) Show that $F(-\omega) = F^*(\omega)$ is a necessary and sufficient condition for $f(t)$ to be real.

(b) Show that $F(-\omega) = -F^*(\omega)$ is a necessary and sufficient condition for $f(t)$ to be pure imaginary.

15.2.2 Let $F(\omega)$ be the Fourier (exponential) transform of $f(t)$ and $G(\omega)$ the Fourier transform of $g(t) = f(t + a)$. Show that

\[
G(\omega) = e^{-ia\omega} F(\omega).
\]

15.2.3 Prove the identities involved in Exercise 6.5.15. 
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15.2.4 Find the Fourier sine and cosine transforms of e . 

15.2.5 Find the Fourier exponential, sine, and cosine transforms of $e^{-a|t|} \cos bt$ and $e^{-a|t|} \sin bt$.

15.2.6 Find the Fourier exponential, sine, and cosine transforms of $1/(a^2 + t^2)^n$ for $n = 2, 3$.



Fourier series, such as $\sum_n a_n \cos(n\pi t/L)$, that we studied in the previous chapter are sums of terms each involving a multiple $n\Delta\omega$ of a basic frequency $\Delta\omega = \pi/L$. If we let the periodicity interval of length $L \to \infty$, then $n\Delta\omega$ becomes a continuous frequency variable $\omega$, and the Fourier series goes over into a Fourier integral $A(t) = \int_0^{\infty} a(\omega) \cos \omega t\, d\omega$ for a nonperiodic function $A(t)$.

This transition from Fourier series to integral is now described in more detail.

In Chapter 14, it was shown that Fourier series are useful in representing certain functions over a limited range $[0, 2\pi]$, $[-L, L]$, and so on, if the function is periodic. We now turn our attention to the problem of representing a nonperiodic function over the infinite range, letting $L \to \infty$. Physically, this sometimes means resolving a single pulse or wave packet into sinusoidal waves or a temperature distribution that decays at $\pm\infty$ into wave components.

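We have seen (Section 14.2) that for the interval $[-L, L]$ the coefficients $a_n$ and $b_n$ could be written as

\[
a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L}\, dt, \tag{15.12}
\]

\[
b_n = \frac{1}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L}\, dt. \tag{15.13}
\]

The resulting Fourier series

\[
f(x) = \frac{1}{2L} \int_{-L}^{L} f(t)\, dt
+ \frac{1}{L} \sum_{n=1}^{\infty} \cos\frac{n\pi x}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L}\, dt
+ \frac{1}{L} \sum_{n=1}^{\infty} \sin\frac{n\pi x}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L}\, dt \tag{15.14}
\]

or

\[
f(x) = \frac{1}{2L} \int_{-L}^{L} f(t)\, dt
+ \frac{1}{L} \sum_{n=1}^{\infty} \int_{-L}^{L} f(t) \cos\frac{n\pi}{L}(t - x)\, dt \tag{15.15}
\]

is Eq. (14.28). However, we now let the parameter $L$ approach infinity, transforming the finite interval $[-L, L]$ into the infinite interval $(-\infty, \infty)$. We set

\[
\frac{n\pi}{L} = \omega, \qquad \frac{\pi}{L} = \Delta\omega, \qquad \text{with } L \to \infty.
\]

Then we have

\[
f(x) \to \frac{1}{\pi} \sum_{n=1}^{\infty} \Delta\omega \int_{-\infty}^{\infty} f(t) \cos \omega(t - x)\, dt \tag{15.16}
\]

or

\[
f(x) = \frac{1}{\pi} \int_0^{\infty} d\omega \int_{-\infty}^{\infty} f(t) \cos \omega(t - x)\, dt, \tag{15.17}
\]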

replacing the infinite sum by the integral over $\omega$. The first term (corresponding to $a_0$) has been absorbed at $\omega = 0$, assuming that $\int_{-\infty}^{\infty} f(t)\, dt$ exists. This Fourier cosine formula is valid if $f$ is continuous at $x$. If $f$ is only piecewise continuous, then $f(x)$ must be replaced by $\frac{1}{2}[f(x + 0) + f(x - 0)]$, which is the average of the limiting values of $f$ to the left and right of the point $x$. Also, integrals $\int_{-\infty}^{\infty} f(t)\, dt$, etc., are always understood as the limit $\lim_{T \to \infty} \int_{-T}^{T} f(t)\, dt$.

It must be emphasized that this Fourier integral representation of $f(x)$ [Eq. (15.17)] is purely formal. It is not intended as a rigorous derivation, but it can be made rigorous (compare I. N. Sneddon, Fourier Transforms, Section 3.2; see Additional Reading). It is subject to the conditions that $f(x)$ is

• piecewise continuous;

• differentiable almost everywhere (of bounded variation); and

• absolutely integrable; that is, $\int_{-\infty}^{\infty} |f(x)|\, dx$ is finite.


Inverse Fourier Transform—Exponential Form 


Our Fourier integral [Eq. (15.17)] may be put into exponential form by noting that because $\cos \omega(t - x)$ is an even function of $\omega$ and $\sin \omega(t - x)$ is an odd function of $\omega$,

\[
f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} f(t) \cos \omega(t - x)\, dt, \tag{15.18}
\]

whereas

\[
\frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} f(t) \sin \omega(t - x)\, dt = 0. \tag{15.19}
\]

Adding Eqs. (15.18) and (15.19) (with a factor $i$), we obtain

\[
f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-i\omega x}\, d\omega\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt \tag{15.20}
\]

or, in terms of the Fourier transform $F(\omega)$ of $f(t)$,

\[
f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega)\, e^{-i\omega x}\, d\omega, \qquad
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{15.21}
\]

The variable $\omega$ introduced here is an arbitrary mathematical variable. In many physical problems, however, $t$ and $x$ are time variables and then $\omega$ corresponds to a frequency. We may then interpret Eq. (15.18) or Eq. (15.20) as a representation of $f(x)$ in terms of a distribution of infinitely long sinusoidal wave trains of angular frequency $\omega$ in which this frequency is a continuous variable.
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EXAMPLE 15.3.1 


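Inversion of Square Pulse Using the Fourier transform in Example 15.2.1, the square pulse can now be inverted as follows:

\[
f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \sqrt{\frac{2}{\pi}}\, \frac{\sin\omega}{\omega}\, e^{-i\omega t}\, d\omega
= \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin\omega}{\omega}\, e^{-i\omega t}\, d\omega.
\]

Splitting the integral into one over $(-\infty, 0)$ and another over $(0, \infty)$ gives

\[
f(t) = \frac{1}{\pi} \int_0^{\infty} \frac{\sin\omega}{\omega} \left( e^{-i\omega t} + e^{i\omega t} \right) d\omega
= \frac{2}{\pi} \int_0^{\infty} \frac{\sin\omega}{\omega} \cos \omega t\, d\omega,
\]

an inverse cosine transform.

Alternatively, we can differentiate the Heaviside unit step function expression [using $du(x)/dx = \delta(x)$] $f(t) = u(t + 1) - u(t - 1)$, giving

\[
\frac{df(t)}{dt} = \delta(t + 1) - \delta(t - 1).
\]

This yields

\[
\frac{df(t)}{dt} = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega t}\, d\omega \int_{-\infty}^{\infty} [\delta(t' + 1) - \delta(t' - 1)]\, e^{i\omega t'}\, dt'
= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega t} \left( e^{-i\omega} - e^{i\omega} \right) d\omega
= -\frac{i}{\pi} \int_{-\infty}^{\infty} e^{-i\omega t} \sin\omega\, d\omega,
\]

and by integrating the result, $f(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin\omega}{\omega}\, e^{-i\omega t}\, d\omega$, as above. As a final check, Exercise 7.2.15 gives us

\[
f(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin\omega}{\omega}\, e^{-i\omega t}\, d\omega =
\begin{cases}
0, & |t| > 1, \\
1, & |t| < 1.
\end{cases}
\]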

Dirac Delta Function Derivation 

If the order of integration of Eq. (15.20) is reversed, we may rewrite it as

\[
f(x) = \int_{-\infty}^{\infty} f(t) \left\{ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - x)}\, d\omega \right\} dt. \tag{15.22}
\]

Apparently, the quantity in curly brackets behaves as a delta function $\delta(t - x)$. We might take Eq. (15.22) as presenting us with a Fourier integral representation of the Dirac delta function. Alternatively, we take it as a clue to a new derivation of the Fourier integral theorem.

From Eq. (1.160) (shifting the singularity from $t = 0$ to $t = x$),

\[
f(x) = \lim_{n \to \infty} \int_{-\infty}^{\infty} f(t)\, \delta_n(t - x)\, dt, \tag{15.23}
\]

where $\delta_n(t - x)$ is a sequence defining the distribution $\delta(t - x)$. Note that Eq. (15.23) assumes that $f(t)$ is continuous at $t = x$. We take $\delta_n(t - x)$ to be

\[
\delta_n(t - x) = \frac{\sin n(t - x)}{\pi (t - x)} = \frac{1}{2\pi} \int_{-n}^{n} e^{i\omega(t - x)}\, d\omega, \tag{15.24}
\]

using Eq. (1.156). Substituting Eq. (15.24) into Eq. (15.23), we have

\[
f(x) = \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) \int_{-n}^{n} e^{i\omega(t - x)}\, d\omega\, dt. \tag{15.25}
\]

Interchanging the order of integration and then taking the limit, as $n \to \infty$, we have Eq. (15.20), the Fourier integral theorem.

With the understanding that it belongs under an integral sign as in Eq. (15.22), the identification

\[
\delta(t - x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - x)}\, d\omega \tag{15.26}
\]

provides a very useful Fourier integral representation of the delta function. It is used to great advantage in Sections 15.6 and 15.7.
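The sifting property behind Eqs. (15.23) and (15.24) can be observed numerically. The sketch below assumes a Gaussian test function (chosen here only because it is smooth and decays quickly) and integrates it against the sequence $\sin n(t - x)/[\pi(t - x)]$ for increasing $n$; all names are illustrative.

```python
import math


def delta_n(u, n):
    """Delta sequence of Eq. (15.24): sin(n*u) / (pi*u), with limit n/pi at u = 0."""
    if abs(u) < 1e-12:
        return n / math.pi
    return math.sin(n * u) / (math.pi * u)


def sift(f, x, n, half_width=12.0, steps=40000):
    """Trapezoid estimate of integral f(t) * delta_n(t - x) dt on [x - w, x + w]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        u = -half_width + k * h          # u = t - x
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x + u) * delta_n(u, n)
    return total * h


def f(t):
    """Smooth, rapidly decaying test function (an assumption of this sketch)."""
    return math.exp(-t * t)


if __name__ == "__main__":
    x = 0.3
    for n in (1, 2, 5, 10):
        print(f"n={n}: integral={sift(f, x, n):.8f}, f(x)={f(x):.8f}")
```

For small $n$ the integral differs visibly from $f(x)$; as $n$ grows, the integral approaches $f(x)$, as Eq. (15.23) requires.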

EXERCISES 
15.3.1 Prove that 



t > 0, 
t < 0. 


This Fourier integral appears in a variety of problems in quantum me¬ 
chanics: Wentzel, Kramers, Brillouin (WKB) barrier penetration, scat¬ 
tering, time-dependent perturbation theory, and so on. 

Hint. Try contour integration. 

15.3.2 Find the Fourier transform of the triangular pulse (Fig. 15.3)

\[
f(t) =
\begin{cases}
h(1 - a|t|), & |t| < 1/a, \\
0, & |t| > 1/a.
\end{cases}
\]

Note. This function provides another delta sequence with $h = a$ and $a \to \infty$.


15.3.3 Prove 


by choosing a suitable contour and applying the residue theorem. 



Figure 15.3 Triangular Pulse [$f(x)$ versus $x$, with height $h$ and zeros at $x = \pm 1/a$]
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15.3.4 Find the Fourier cosine, sine, and complex transforms of e ,rjr . 

15.3.5 Define a sequence

\[
\delta_n(x) =
\begin{cases}
n, & |x| < 1/2n, \\
0, & |x| > 1/2n.
\end{cases}
\]

[This is Eq. (1.153).] Express $\delta_n(x)$ as a Fourier integral and show that we may write

\[
\delta(x) = \lim_{n \to \infty} \delta_n(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, dk.
\]

15.3.6 Using the sequence

\[
\delta_n(x) = \frac{n}{\sqrt{\pi}} \exp(-n^2 x^2),
\]

show that

\[
\delta(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, dk.
\]

Note. Remember that $\delta(x)$ is defined in terms of its behavior as part of an integrand [Section 1.14, especially Eqs. (1.151) and (1.152)].


15.4 Fourier Transforms—Inversion Theorem 


Let us define $F(\omega)$, the Fourier transform of the function $f(t)$, by

\[
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{15.27}
\]


Exponential Transform


Then from Eq. (15.20) we have the inverse relation

\[
f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega)\, e^{-i\omega t}\, d\omega. \tag{15.28}
\]

Note that Eqs. (15.27) and (15.28) are almost but not quite symmetrical, differing only in the sign of $i$.

Here two points deserve comment. First, the $1/\sqrt{2\pi}$ symmetry is a matter of choice, not of necessity. Many authors attach the entire $1/(2\pi)$ factor of Eq. (15.20) to either Eq. (15.27) or Eq. (15.28). Second, although the Fourier integral [Eq. (15.20)] has received much attention in the mathematics literature, we shall be primarily interested in the Fourier transform and its inverse. They are the equations with physical significance.

When we move the Fourier transform pair to three-dimensional space, it becomes

\[
F(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int f(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3r, \tag{15.29}
\]

\[
f(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}} \int F(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, d^3k. \tag{15.30}
\]

The integrals are over all space. Verification, if desired, follows immediately by substituting the left-hand side of one equation into the integrand of the other equation and using the three-dimensional delta function.² Equation (15.30) may be interpreted as an expansion of a function $f(\mathbf{r})$ in a continuum of plane wave eigenfunctions; $F(\mathbf{k})$ then becomes the amplitude of the wave $\exp(-i\mathbf{k}\cdot\mathbf{r})$.


Cosine Transform 


If $f(x)$ is odd or even, these transforms may be expressed in a different form. Consider, first, $f_c(x) = f_c(-x)$, even. Writing the exponential of Eq. (15.27) in trigonometric form, we have

\[
F_c(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f_c(t) (\cos \omega t + i \sin \omega t)\, dt
= \sqrt{\frac{2}{\pi}} \int_0^{\infty} f_c(t) \cos \omega t\, dt, \tag{15.31}
\]

the $\sin \omega t$ dependence vanishing on integration over the symmetric interval $(-\infty, \infty)$. Similarly, since $\cos \omega t$ is even, Eq. (15.28) transforms to

\[
f_c(t) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} F_c(\omega) \cos \omega t\, d\omega. \tag{15.32}
\]

Equations (15.31) and (15.32) are known as Fourier cosine transforms.
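As an illustration of Eq. (15.31), the sketch below evaluates the cosine transform of $e^{-at}$ numerically and compares it with $\sqrt{2/\pi}\, a/(a^2 + \omega^2)$, the standard closed form for this integral. The decay constant `A`, the cutoff `t_max`, and the function names are assumptions made for the example.

```python
import math

A = 1.5   # decay constant a > 0 (value chosen for illustration)


def cosine_transform(f, omega, t_max=30.0, n=40000):
    """Trapezoid estimate of sqrt(2/pi) * integral_0^t_max f(t) cos(omega*t) dt."""
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * math.cos(omega * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.cos(omega * t)
    return total * h * math.sqrt(2.0 / math.pi)


def decaying_exponential(t):
    """f_c(t) = exp(-a t) for t >= 0 (extended as an even function)."""
    return math.exp(-A * t)


def closed_form(omega):
    """sqrt(2/pi) * a / (a^2 + omega^2)."""
    return math.sqrt(2.0 / math.pi) * A / (A * A + omega * omega)


if __name__ == "__main__":
    for omega in (0.0, 1.0, 4.0):
        print(omega, cosine_transform(decaying_exponential, omega), closed_form(omega))
```

Only values of $t \ge 0$ enter, in line with the remark that the cosine transform effectively imposes even parity.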


EXAMPLE 15.4.1 


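Evaluation of Fourier Cosine Transform Evaluate the Fourier cosine integral of $e^{-ax}$ with $a$ a positive constant. Integrating by parts twice, we obtain

\[
\int_0^{\infty} e^{-ax} \cos \omega x\, dx
= \left. -\frac{1}{a} e^{-ax} \cos \omega x \right|_0^{\infty} - \frac{\omega}{a} \int_0^{\infty} e^{-ax} \sin \omega x\, dx
= \frac{1}{a} - \frac{\omega}{a} \left[ \left. -\frac{1}{a} e^{-ax} \sin \omega x \right|_0^{\infty} + \frac{\omega}{a} \int_0^{\infty} e^{-ax} \cos \omega x\, dx \right].
\]

Now we combine the integral on the right-hand side with that on the left, giving

\[
\left( 1 + \frac{\omega^2}{a^2} \right) \int_0^{\infty} e^{-ax} \cos \omega x\, dx = \frac{1}{a}
\]

or

\[
\int_0^{\infty} e^{-ax} \cos \omega x\, dx = \frac{a}{a^2 + \omega^2}.
\]

² $\delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(x_1 - x_2)\delta(y_1 - y_2)\delta(z_1 - z_2)$ with Fourier integral $\delta(x_1 - x_2) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp[ik_1(x_1 - x_2)]\, dk_1$, etc.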



Sine Transform 


The corresponding pair of Fourier sine transforms is obtained by assuming that $f_s(x) = -f_s(-x)$, odd, and applying the same symmetry arguments. The equations are

\[
F_s(\omega) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} f_s(t) \sin \omega t\, dt,^{3} \tag{15.33}
\]

\[
f_s(t) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} F_s(\omega) \sin \omega t\, d\omega. \tag{15.34}
\]

From the last equation we may develop the physical interpretation that $f(t)$ is being described by a continuum of sine waves. The amplitude of $\sin \omega t$ is given by $\sqrt{2/\pi}\, F_s(\omega)$, in which $F_s(\omega)$ is the Fourier sine transform of $f(t)$. It will be seen that Eq. (15.34) is the integral analog of the summation [Eq. (14.23)]. Similar interpretations hold for the cosine and exponential cases.


EXAMPLE 15.4.2 


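Evaluation of Fourier Sine Transform Evaluate the Fourier sine integral of $\omega/(a^2 + \omega^2)$ with $a$ a positive constant. The denominator has the poles $\omega = \pm ia$, suggesting contour integration in the complex $\omega$-plane. With this in mind, we replace $\omega \to -\omega$ and show that

\[
\int_0^{\infty} \frac{\omega e^{-i\omega t}\, d\omega}{a^2 + \omega^2} = -\int_{-\infty}^{0} \frac{\omega e^{i\omega t}\, d\omega}{a^2 + \omega^2}
\]

so that our Fourier sine integral becomes

\[
\int_0^{\infty} \frac{\omega \sin \omega t\, d\omega}{a^2 + \omega^2}
= \frac{1}{2i} \int_0^{\infty} \frac{\omega (e^{i\omega t} - e^{-i\omega t})}{a^2 + \omega^2}\, d\omega
= \frac{1}{2i} \int_{-\infty}^{\infty} \frac{\omega e^{i\omega t}}{a^2 + \omega^2}\, d\omega.
\]

For $t > 0$, we close the contour in the upper $\omega$-plane by a large half-circle, which does not contribute to the integral as its radius goes to $\infty$. We pick up the residue $ia e^{-at}/2ia$ at the pole $\omega = ia$ and find

\[
\int_0^{\infty} \frac{\omega \sin \omega t\, d\omega}{a^2 + \omega^2} = \frac{1}{2i}\, 2\pi i\, \frac{e^{-at}}{2} = \frac{\pi}{2}\, e^{-at}, \qquad t > 0.
\]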



If we take Eqs. (15.27), (15.31), and (15.33) as the direct integral transforms, described by $\mathcal{L}$ in Eq. (15.4) (Section 15.1), the corresponding inverse transforms, $\mathcal{L}^{-1}$ of Eq. (15.5), are given by Eqs. (15.28), (15.32), and (15.34).


³ Note that a factor $-i$ has been absorbed into this $F_s(\omega)$.


EXAMPLE 15.4.3


Proton Charge Form Factor The charge form factor

\[
G_E(q^2) = \int \rho(\mathbf{r})\, e^{i\mathbf{q}\cdot\mathbf{r}}\, d^3r
\]

of a particle is defined as the Fourier transform of its charge density $\rho$, except for the factor $(2\pi)^{-3/2}$; $G_E$ can be measured by elastically scattering electrons from a target of particles (H atoms for the proton) because the (so-called Mott) cross section of a pointlike particle is modified by the charge form factor squared for a particle with finite size, if magnetic scattering is neglected. This is a good approximation at small scattering angle $\theta$. The momentum transfer $\mathbf{q} = \mathbf{p}' - \mathbf{p}$ is taken in units of $\hbar$, where $\mathbf{p}$ is the incident electron momentum and $\mathbf{p}'$ the scattered electron momentum in the laboratory frame (rest frame of the proton). For elastic scattering $p = |\mathbf{p}| = |\mathbf{p}'| = p'$, if recoil is neglected at low momentum $p$. Figure 15.4 shows that $q = |\mathbf{q}| = 2p \sin \theta/2$.

For a pointlike particle of charge $Q$, $\rho(\mathbf{r}) = Q\delta(\mathbf{r})$ so that the charge form factor

\[
G_E(q^2) = Q \int \delta(\mathbf{r})\, e^{i\mathbf{q}\cdot\mathbf{r}}\, d^3r = Q
\]

is constant.

At $q = 0$, $G_E(0) = \int \rho\, d^3r = Q$ is the total charge $Q$ in units of the elementary charge $|e|$; for the proton $Q = 1$.

In case of spherical symmetry, we use polar coordinates $r, \theta, \varphi$ in which the charge form factor takes the form

\[
G_E(q^2) = \int_0^{\infty} \rho(r)\, r^2\, dr \int_{-1}^{1} e^{iqr\cos\theta}\, d\cos\theta \int_0^{2\pi} d\varphi
= 2\pi \int_0^{\infty} \rho(r)\, r^2\, \frac{e^{iqr} - e^{-iqr}}{iqr}\, dr
= \frac{4\pi}{q} \int_0^{\infty} \rho(r)\, r \sin qr\, dr. \tag{15.35}
\]



Figure 15.4 Proton Charge Form Factor

Inverting this sine transform, one can extract the charge density from the measured charge form factor $G_E(q^2)$. This is how the proton, nuclear radii, and sizes of atoms and molecules are measured by electron scattering.

At small $q$ compared to the inverse radius of the proton, we can use the power series for $\sin qr$ and obtain

\[
G_E(q^2) = 4\pi \int_0^{\infty} \rho(r)\, r^2\, dr - \frac{4\pi q^2}{6} \int_0^{\infty} \rho(r)\, r^4\, dr + \cdots = 1 - \frac{q^2}{6} \langle r^2 \rangle + \cdots,
\]

where the first term is the charge $Q = 1$ and the integral in the second term is the mean square radius $\langle r^2 \rangle$ of the proton, because the density $\rho = |\psi|^2$ is given by the quark wave function $\psi$, quarks being the constituents of the proton. Thus, the proton size can be extracted from the measured slope of the proton charge form factor,

\[
\langle r^2 \rangle = -6 \left. \frac{dG_E(q^2)}{dq^2} \right|_{q=0}.
\]

Since the proton has a finite radius of approximately 1 fm $= 10^{-13}$ cm, we consider a spherically symmetric model (Fig. 15.5A)

\[
\rho(r) = \frac{N}{r} \left[ e^{-r/R_1} - e^{-r/R} \right],
\]

where $R \ll R_1$ are finite size parameters yet to be determined. The normalization $N$ follows from the charge 1 of the proton, in units of the elementary charge $e$; that is, $G_E(0) = 1$. The same model applies to the charge density of the $^3$He nucleus, except for the charge $G_E(0) = 2$ and its larger radius, because it is made up of two protons and one neutron instead of three quarks for the proton.


Figure 15.5 Charge Density (A) and Form Factor (B) of the Proton

Let us start by determining $N$ from $G_E(0) = 1$ by integrating by parts as follows:

\[
1 = 4\pi \int_0^{\infty} \rho(r)\, r^2\, dr = 4\pi N \int_0^{\infty} \left[ e^{-r/R_1} - e^{-r/R} \right] r\, dr
\]

\[
= 4\pi N \left. \left[ -r R_1 e^{-r/R_1} + r R e^{-r/R} \right] \right|_0^{\infty} + 4\pi N \int_0^{\infty} \left( R_1 e^{-r/R_1} - R e^{-r/R} \right) dr
\]

\[
= 4\pi N (R_1^2 - R^2), \qquad N = \frac{1}{4\pi (R_1^2 - R^2)}.
\]

A look at the sine transform for $G_E(q^2)$ [Eq. (15.35)] tells us that we also need to calculate the integral

\[
\int_0^{\infty} e^{-r/R} \sin qr\, dr
= \left. -R e^{-r/R} \sin qr \right|_0^{\infty} + qR \int_0^{\infty} e^{-r/R} \cos qr\, dr
= qR \left[ \left. -R e^{-r/R} \cos qr \right|_0^{\infty} - qR \int_0^{\infty} e^{-r/R} \sin qr\, dr \right],
\]

which we do by integrating by parts twice. This yields the same integral on the right-hand side, which we combine with that on the left-hand side, so that

\[
\int_0^{\infty} e^{-r/R} \sin qr\, dr = \frac{qR^2}{1 + q^2 R^2}.
\]

Substituting this result into the $G_E$ sine transform formula [Eq. (15.35)] yields

\[
G_E(q^2) = \frac{4\pi N}{q} \int_0^{\infty} \left[ e^{-r/R_1} - e^{-r/R} \right] \sin qr\, dr
= \frac{1}{R_1^2 - R^2} \left( \frac{R_1^2}{1 + q^2 R_1^2} - \frac{R^2}{1 + q^2 R^2} \right).
\]

Note that at $q = 0$ this charge form factor is properly normalized to unity, whereas at large $q$ it falls like $q^{-4}$. This falloff is called quark counting and predicted by quantum chromodynamics, the quantum field theory of the strong interaction that binds quarks in the proton. Our nonrelativistic model simulates this behavior. Now we choose $R_1 = 1$ fm, approximately the size of the proton, and $R = 1/4$ fm; this is shown in Fig. 15.5. ■
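The model form factor can be checked numerically against the sine transform of Eq. (15.35). The sketch below uses the values $R_1 = 1$ fm and $R = 1/4$ fm from the text; the quadrature cutoff, step count, and function names are assumptions of this illustration. It also estimates the mean square radius from the slope at $q = 0$, which for this model evaluates to $6(R_1^2 + R^2)$ when the closed form is expanded to first order in $q^2$.

```python
import math

R1 = 1.0    # fm, value chosen in the text
R = 0.25    # fm, value chosen in the text
N = 1.0 / (4.0 * math.pi * (R1 ** 2 - R ** 2))   # normalization so that G_E(0) = 1


def form_factor_closed(q):
    """Closed form of G_E(q^2) for the two-exponential model density."""
    return (R1 ** 2 / (1.0 + (q * R1) ** 2)
            - R ** 2 / (1.0 + (q * R) ** 2)) / (R1 ** 2 - R ** 2)


def form_factor_numeric(q, r_max=60.0, n=60000):
    """(4 pi N / q) * integral_0^inf [exp(-r/R1) - exp(-r/R)] sin(q r) dr, trapezoid rule."""
    h = r_max / n
    total = 0.0     # the integrand is zero at r = 0 and negligible at r_max
    for k in range(1, n):
        r = k * h
        total += (math.exp(-r / R1) - math.exp(-r / R)) * math.sin(q * r)
    return 4.0 * math.pi * N / q * total * h


def mean_square_radius(dq=1e-3):
    """-6 dG_E/d(q^2) at q = 0, estimated by a one-sided difference in q^2."""
    return -6.0 * (form_factor_closed(dq) - 1.0) / dq ** 2


if __name__ == "__main__":
    for q in (0.5, 2.0, 5.0):
        print(q, form_factor_numeric(q), form_factor_closed(q))
    print("<r^2> estimate (fm^2):", mean_square_radius())
```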

Note that the Fourier cosine transforms and the Fourier sine transforms each involve only positive values (and zero) of the arguments. We use the parity of $f(t)$ to establish the transforms, but once the transforms are established, the behavior of the functions $f$ and $F$ for negative argument is irrelevant. In effect, the transform equations impose a definite parity: even for the Fourier cosine transform and odd for the Fourier sine transform.
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EXAMPLE 15.4.4 


Finite Wave Train An important application of the Fourier transform is the resolution of a finite pulse into sinusoidal waves. Imagine that an infinite wave train $\sin \omega_0 t$ is clipped by Kerr cell or saturable dye cell shutters so that

\[
f(t) =
\begin{cases}
\sin \omega_0 t, & |t| < \dfrac{N\pi}{\omega_0}, \\[1ex]
0, & |t| > \dfrac{N\pi}{\omega_0}.
\end{cases} \tag{15.36}
\]

This corresponds to $N$ cycles of our original wave train (Fig. 15.6). Since $f(t)$ is odd, we may use the Fourier sine transform [Eq. (15.33)] to obtain

\[
F_s(\omega) = \sqrt{\frac{2}{\pi}} \int_0^{N\pi/\omega_0} \sin \omega_0 t \sin \omega t\, dt. \tag{15.37}
\]

Integrating, we find our amplitude function

\[
F_s(\omega) = \sqrt{\frac{2}{\pi}} \left[ \frac{\sin[(\omega_0 - \omega)(N\pi/\omega_0)]}{2(\omega_0 - \omega)} - \frac{\sin[(\omega_0 + \omega)(N\pi/\omega_0)]}{2(\omega_0 + \omega)} \right]. \tag{15.38}
\]

It is of considerable interest to see how $F_s(\omega)$ depends on frequency. For large $\omega_0$ and $\omega \approx \omega_0$, only the first term will be of any importance because of the denominators. It is plotted in Fig. 15.7. This is the amplitude curve for the single slit diffraction pattern.

Figure 15.7 Fourier Transform of Finite Wave Train [$F_s(\omega)$ versus $\omega$]

There are zeros at

\[
\frac{\omega_0 - \omega}{\omega_0} = \frac{\Delta\omega}{\omega_0} = \pm\frac{1}{N}, \pm\frac{2}{N}, \text{ and so on.} \tag{15.39}
\]

For large $N$, $F_s(\omega)$ may also be interpreted as a Dirac delta distribution, as in Section 1.14. Since the contributions outside the central maximum are small in this case, we may take

\[
\Delta\omega = \frac{\omega_0}{N} \tag{15.40}
\]

as a good measure of the spread in frequency of our wave pulse. Clearly, if $N$ is large (a long pulse), the frequency spread will be small. On the other hand, if our pulse is clipped short ($N$ is small), the frequency distribution will be wider and the secondary maxima are more important. ■
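The amplitude function of Eq. (15.38) and its zeros, Eq. (15.39), can be explored with a few lines of code. This is a minimal sketch: the values $\omega_0 = 10$ and $N = 5$ are arbitrary choices for illustration, and the function names are not from the text.

```python
import math


def amplitude(omega, omega0, n_cycles):
    """Closed form of Eq. (15.38); requires omega != omega0 and omega != -omega0."""
    t_end = n_cycles * math.pi / omega0
    first = math.sin((omega0 - omega) * t_end) / (2.0 * (omega0 - omega))
    second = math.sin((omega0 + omega) * t_end) / (2.0 * (omega0 + omega))
    return math.sqrt(2.0 / math.pi) * (first - second)


def amplitude_numeric(omega, omega0, n_cycles, steps=20000):
    """Trapezoid estimate of the sine transform in Eq. (15.37)."""
    t_end = n_cycles * math.pi / omega0
    h = t_end / steps
    # The t = 0 end point contributes nothing because sin(0) = 0.
    total = 0.5 * math.sin(omega0 * t_end) * math.sin(omega * t_end)
    for k in range(1, steps):
        t = k * h
        total += math.sin(omega0 * t) * math.sin(omega * t)
    return math.sqrt(2.0 / math.pi) * total * h


if __name__ == "__main__":
    omega0, n_cycles = 10.0, 5
    for omega in (9.3, 9.7, 10.4):
        print(omega, amplitude(omega, omega0, n_cycles),
              amplitude_numeric(omega, omega0, n_cycles))
    # Zeros predicted by Eq. (15.39): omega = omega0 * (1 - k/N)
    for k in (1, 2):
        print("zero check:", amplitude(omega0 * (1.0 - k / n_cycles), omega0, n_cycles))
```

Increasing `n_cycles` moves the first zeros closer to $\omega_0$, which is the narrowing of the frequency spread described above.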


EXERCISES 


15.4.1 The function

\[
f(t) =
\begin{cases}
1, & |t| < 1, \\
0, & |t| > 1
\end{cases}
\]

is a symmetrical finite step function.

(a) Find $F_c(\omega)$, the Fourier cosine transform of $f(t)$.

(b) Taking the inverse cosine transform, show that

\[
f(t) = \frac{2}{\pi} \int_0^{\infty} \frac{\sin\omega \cos\omega t}{\omega}\, d\omega.
\]

(c) From part (b) show that

\[
\int_0^{\infty} \frac{\sin\omega \cos\omega t}{\omega}\, d\omega =
\begin{cases}
0, & |t| > 1, \\
\dfrac{\pi}{4}, & |t| = 1, \\[1ex]
\dfrac{\pi}{2}, & |t| < 1.
\end{cases}
\]

15.4.2 Derive sine and cosine representations of $\delta(t - x)$ that are comparable to the exponential representation [Eq. (15.26)].

ANS. $\dfrac{2}{\pi} \int_0^{\infty} \sin\omega t \sin\omega x\, d\omega$, $\quad \dfrac{2}{\pi} \int_0^{\infty} \cos\omega t \cos\omega x\, d\omega$.

15.4.3 In a resonant cavity, an electromagnetic oscillation of frequency $\omega_0$ dies out as

\[
A(t) = A_0 e^{-\omega_0 t/2Q} e^{-i\omega_0 t}, \qquad t > 0.
\]

(Take $A(t) = 0$ for $t < 0$.)

The parameter $Q$ is a measure of the ratio of stored energy to energy loss per cycle. Calculate the frequency distribution of the oscillation, $a^*(\omega)a(\omega)$, where $a(\omega)$ is the Fourier transform of $A(t)$.

Note. The larger $Q$ is, the sharper your resonance line will be.

ANS. $a^*(\omega)a(\omega) = \dfrac{A_0^2}{2\pi}\, \dfrac{1}{(\omega - \omega_0)^2 + (\omega_0/2Q)^2}$.

15.4.4 (a) Calculate the Fourier exponential transform of $f(t) = t^n e^{-a|t|}$ for $n = 1, 2, 3$.

(b) Calculate the inverse transform by employing the calculus of residues (Section 7.2).


15.5 Fourier Transform of Derivatives 


Figure 15.1 outlines the overall technique of using Fourier transforms and 
inverse transforms to solve a problem. Here, we take an initial step in solving 
a differential equation —obtaining the Fourier transform of a derivative. 

Using the exponential form, we determine that the Fourier transform of $f(t)$ is

\[
F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt, \tag{15.41}
\]

and for $df(t)/dt$

\[
F_1(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{df(t)}{dt}\, e^{i\omega t}\, dt. \tag{15.42}
\]

Integrating Eq. (15.42) by parts, we obtain

\[
F_1(\omega) = \left. \frac{e^{i\omega t}}{\sqrt{2\pi}}\, f(t) \right|_{-\infty}^{\infty} - \frac{i\omega}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{15.43}
\]

If $f(t)$ vanishes⁴ as $t \to \pm\infty$, we have

\[
F_1(\omega) = -i\omega F(\omega); \tag{15.44}
\]

that is, the transform of the derivative is $(-i\omega)$ times the transform of the original function. This may readily be generalized to the $n$th derivative to yield

\[
F_n(\omega) = (-i\omega)^n F(\omega), \tag{15.45}
\]

provided all the integrated parts of Eq. (15.43) vanish as $t \to \pm\infty$. This is the power of the Fourier transform, the main reason it is so useful in solving (partial) differential equations. The operation of differentiation in coordinate space has been replaced by a multiplication in $\omega$ space. Such properties of the kernel are the key in applications of integral transforms to solving ordinary differential equations (ODEs) and partial differential equations (PDEs), developed next.

⁴ Apart from cases such as Exercises 15.3.5 and 15.3.6, $f(t)$ must vanish as $t \to \pm\infty$ in order for the Fourier transform of $f(t)$ to exist.
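The derivative rule of Eq. (15.44) can be verified numerically for a function that vanishes at infinity. The sketch below assumes a Gaussian test function and a trapezoid rule on a truncated interval; the helper names are illustrative only.

```python
import cmath
import math


def transform(func, omega, t_max=10.0, n=8000):
    """Trapezoid estimate of (1/sqrt(2*pi)) * integral func(t) exp(i*omega*t) dt on [-t_max, t_max]."""
    h = 2.0 * t_max / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        t = -t_max + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * func(t) * cmath.exp(1j * omega * t)
    return total * h / math.sqrt(2.0 * math.pi)


def f(t):
    """Test function that vanishes as t -> +-infinity (an assumption of this sketch)."""
    return math.exp(-t * t)


def df(t):
    """Analytic derivative of f."""
    return -2.0 * t * math.exp(-t * t)


if __name__ == "__main__":
    for omega in (0.5, 2.0):
        lhs = transform(df, omega)                 # F_1(omega)
        rhs = -1j * omega * transform(f, omega)    # -i * omega * F(omega)
        print(omega, lhs, rhs)
```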


EXAMPLE 15.5.1 


Driven Harmonic Oscillator If we substitute the Fourier integral $y(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} Y(\omega)\, e^{i\omega t}\, d\omega$ into the harmonic oscillator ODE $\frac{d^2y}{dt^2} + \Omega^2 y = A \cos(\omega_0 t)$, where $t$ is the time now, we obtain an algebraic equation for $Y(\omega)$ called the Fourier transform of our solution $y(t)$,

\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\Omega^2 - \omega^2)\, Y(\omega)\, e^{i\omega t}\, d\omega = \frac{A}{2} \int_{-\infty}^{\infty} e^{i\omega t} [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]\, d\omega,
\]

because differentiating twice corresponds to multiplying $Y(\omega)$ by $(i\omega)^2$, and we represent the driving term as a Fourier integral with the only frequencies $\pm\omega_0$. Upon comparing integrands, valid because the integrals are over the same interval in the same variable $\omega$ (or, more rigorously, using the inverse Fourier transform), we find

\[
Y(\omega) = \frac{A}{\Omega^2 - \omega^2} \sqrt{\frac{\pi}{2}}\, [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)].
\]

The resulting integral,

\[
y(t) = \frac{A}{2} \int_{-\infty}^{\infty} \frac{e^{i\omega t}}{\Omega^2 - \omega^2} [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]\, d\omega = \frac{A}{\Omega^2 - \omega_0^2} \cos(\omega_0 t),
\]

is the steady-state and particular solution of our inhomogeneous ODE. Note that the assumption that the end points in the partially integrated term in Eq. (15.43) do not contribute eliminates solutions of the homogeneous harmonic oscillator ODE (called transients in physics; undamped $\sin \Omega t$, $\cos \Omega t$ solutions in our case).

Alternatively, we Fourier transform the ODE as follows:

\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left( \frac{d^2y}{dt^2} + \Omega^2 y \right) e^{i\omega t}\, dt
= \frac{A}{2\sqrt{2\pi}} \int_{-\infty}^{\infty} \left( e^{i\omega_0 t} + e^{-i\omega_0 t} \right) e^{i\omega t}\, dt
= A \sqrt{\frac{\pi}{2}}\, [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)].
\]

We integrate by parts twice,

\[
\int_{-\infty}^{\infty} \frac{d^2y}{dt^2}\, e^{i\omega t}\, dt
= \left. \frac{dy}{dt}\, e^{i\omega t} \right|_{-\infty}^{\infty} - i\omega \int_{-\infty}^{\infty} \frac{dy}{dt}\, e^{i\omega t}\, dt
= \left. -i\omega\, y\, e^{i\omega t} \right|_{-\infty}^{\infty} + (i\omega)^2 \int_{-\infty}^{\infty} y\, e^{i\omega t}\, dt
= -\omega^2 \int_{-\infty}^{\infty} y\, e^{i\omega t}\, dt,
\]

assuming that $y(t) \to 0$ and $\frac{dy}{dt} \to 0$ sufficiently fast, as $t \to \pm\infty$. The result of comparing integrands (using the inverse Fourier transform) is the same as before:

\[
(\Omega^2 - \omega^2)\, Y(\omega) = A \sqrt{\frac{\pi}{2}}\, [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)].
\]

Similarly, a PDE might become an ODE, such as the heat flow PDE considered next.


EXAMPLE 15.5.2


Heat Flow PDE To illustrate the transformation of a PDE into an ODE, let us Fourier transform the heat flow partial differential equation

\[
\frac{\partial \psi}{\partial t} = a^2 \frac{\partial^2 \psi}{\partial x^2},
\]

where the solution $\psi(x, t)$ is the temperature in space as a function of time. By substituting the Fourier integral solution

\[
\psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \Psi(\omega, t)\, e^{-i\omega x}\, d\omega,
\]

this yields an ODE for the Fourier transform $\Psi$ of $\psi$,

\[
\frac{\partial \Psi}{\partial t} = -a^2 \omega^2\, \Psi(\omega, t),
\]

in the time variable $t$. Alternatively and equivalently, apply the inverse Fourier transform to each side of the heat PDE. Integrating, we obtain

\[
\ln \Psi = -a^2 \omega^2 t + \ln C, \quad \text{or} \quad \Psi = C e^{-a^2 \omega^2 t},
\]

where the integration constant $C$ may still depend on $\omega$ and, in general, is determined by initial conditions. Putting this solution back into our inverse Fourier transform,

\[
\psi(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} C(\omega)\, e^{-i\omega x}\, e^{-a^2 \omega^2 t}\, d\omega,
\]

yields a separation of the $x$ and $t$ variables. For simplicity, we here take $C$ $\omega$-independent (assuming appropriate initial conditions) and integrate by completing the square in $\omega$, as in Example 15.2.2, making appropriate changes of variables and parameters ($a^2 \to a^2 t$, $\omega \to x$, $t \to -\omega$). This yields the particular solution of the heat flow PDE,

\[
\psi(x, t) = \frac{C}{a\sqrt{2t}} \exp\left( -\frac{x^2}{4a^2 t} \right),
\]

that appears as a clever guess in Section 16.2. In effect, we have shown that $\psi$ is the inverse Fourier transform of $C \exp(-a^2 \omega^2 t)$. ■


EXAMPLE 15.5.3


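Inversion of PDE Derive a Fourier integral for the Green's function $G_0$ of Poisson's PDE, which is a solution of

\[
\nabla^2 G_0(\mathbf{r}, \mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}').
\]

Once $G_0$ is known, the general solution of Poisson's PDE

\[
\nabla^2 \Phi = -4\pi\rho(\mathbf{r})
\]

of electrostatics is given as

\[
\Phi(\mathbf{r}) = -\int G_0(\mathbf{r}, \mathbf{r}')\, 4\pi\rho(\mathbf{r}')\, d^3r'.
\]

Applying $\nabla^2$ to $\Phi$ and using the PDE the Green's function satisfies, we check that

\[
\nabla^2 \Phi(\mathbf{r}) = -\int \nabla^2 G_0(\mathbf{r}, \mathbf{r}')\, 4\pi\rho(\mathbf{r}')\, d^3r' = -\int \delta(\mathbf{r} - \mathbf{r}')\, 4\pi\rho(\mathbf{r}')\, d^3r' = -4\pi\rho(\mathbf{r}).
\]

Now we use Fourier transforms of the $\delta$ function and $G_0$, writing

\[
\nabla^2 \int g_0(\mathbf{p})\, e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')}\, \frac{d^3p}{(2\pi)^3} = \int e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')}\, \frac{d^3p}{(2\pi)^3}.
\]

Because the integrands of equal Fourier integrals must be the same (almost) everywhere, which follows from the inverse Fourier transform, and with

\[
\nabla e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')} = i\mathbf{p}\, e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')},
\]

this yields $-p^2 g_0(\mathbf{p}) = 1$. Substituting this solution into the inverse Fourier transform for $G_0$ gives

\[
G_0(\mathbf{r}, \mathbf{r}') = -\int e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')}\, \frac{d^3p}{(2\pi)^3 p^2} = -\frac{1}{4\pi |\mathbf{r} - \mathbf{r}'|}.
\]

We can verify the last part of this result by applying $\nabla^2$ to $G_0$ again and recalling from Chapter 1 that $\nabla^2 \frac{1}{|\mathbf{r} - \mathbf{r}'|} = -4\pi\delta(\mathbf{r} - \mathbf{r}')$.

The inverse Fourier transform can be evaluated using polar coordinates exploiting the spherical symmetry of $p^2$, similar to the charge form factor in Example 15.4.3 for a spherically symmetric charge density. For simplicity, we write $\mathbf{R} = \mathbf{r} - \mathbf{r}'$ and call $\theta$ the angle between $\mathbf{R}$ and $\mathbf{p}$ so that

\[
\int e^{i\mathbf{p}\cdot\mathbf{R}}\, \frac{d^3p}{p^2}
= \int_0^{\infty} dp \int_{-1}^{1} e^{ipR\cos\theta}\, d\cos\theta \int_0^{2\pi} d\varphi
= \frac{2\pi}{iR} \int_0^{\infty} \frac{dp}{p} \left. e^{ipR\cos\theta} \right|_{\cos\theta = -1}^{1}
= \frac{4\pi}{R} \int_0^{\infty} \frac{\sin pR}{pR}\, d(pR) = \frac{2\pi^2}{R},
\]

where $\theta$ and $\varphi$ are the angles of $\mathbf{p}$, and $\int_0^{\infty} \frac{\sin x}{x}\, dx = \frac{\pi}{2}$ from Example 7.2.4. Dividing by $-(2\pi)^3$, we obtain $G_0(R) = -1/(4\pi R)$, as claimed. An evaluation of this Fourier transform by contour integration is given in Example 16.3.2. ■


EXAMPLE 15.5.4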


Wave Equation The Fourier transform technique may be used to advantage in handling PDEs with constant coefficients. To illustrate the technique further, let us derive a familiar expression of elementary physics. An infinitely long string is vibrating freely. The amplitude $y$ of the (small) vibrations satisfies the wave equation

\[
\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2 y}{\partial t^2}. \tag{15.46}
\]

We shall assume an initial condition

\[
y(x, 0) = f(x), \tag{15.47}
\]

where $f$ is localized, that is, approaches zero at large $x$.

Applying our Fourier transform to both sides of our PDE [Eq. (15.46)] means multiplying by $e^{i\alpha x}/\sqrt{2\pi}$ and integrating over $x$ according to

\[
Y(\alpha, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y(x, t)\, e^{i\alpha x}\, dx \tag{15.48}
\]

and using Eq. (15.43) for the second derivative. Note that the integrated part of $\partial y/\partial x$ and $y$ vanishes: The wave has not yet gone to $\pm\infty$, as it is propagating forward in time, and there is no source at infinity [$f(\pm\infty) = 0$]. We obtain

\[
\int_{-\infty}^{\infty} \frac{\partial^2 y(x, t)}{\partial x^2}\, e^{i\alpha x}\, dx = \frac{1}{v^2} \int_{-\infty}^{\infty} \frac{\partial^2 y(x, t)}{\partial t^2}\, e^{i\alpha x}\, dx \tag{15.49}
\]

or

\[
(-i\alpha)^2\, Y(\alpha, t) = \frac{1}{v^2} \frac{\partial^2 Y(\alpha, t)}{\partial t^2}. \tag{15.50}
\]

Since no derivatives with respect to $\alpha$ appear, Eq. (15.50) is actually an ODE; in fact, it is the linear oscillator equation. This transformation, from a PDE to an ODE, is a significant simplification. We solve Eq. (15.50) subject to the appropriate initial conditions. At $t = 0$, applying Eq. (15.47), Eq. (15.48) reduces to

\[
Y(\alpha, 0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{i\alpha x}\, dx = F(\alpha), \tag{15.51}
\]

where $F(\alpha)$ is the Fourier transform of the initial condition $f(x)$. The general solution of Eq. (15.50) in exponential form is

\[
Y(\alpha, t) = F(\alpha)\, e^{\pm i v \alpha t}. \tag{15.52}
\]

Using the inversion formula [Eq. (15.28)], we have

\[
y(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} Y(\alpha, t)\, e^{-i\alpha x}\, d\alpha, \tag{15.53}
\]

and, by Eq. (15.52),

\[
y(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\alpha)\, e^{-i\alpha(x \mp vt)}\, d\alpha. \tag{15.54}
\]

Since $f(x)$ is the Fourier inverse transform of $F(\alpha)$,

\[
y(x, t) = f(x \mp vt), \tag{15.55}
\]

corresponding to waves advancing in the $+x$- and $-x$-directions, respectively.

The boundary condition of Eq. (15.47) is built into these particular linear combinations of waves. ■
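A quick finite-difference check confirms that the traveling waves $f(x \mp vt)$ of Eq. (15.55) satisfy the wave equation (15.46) and the initial condition (15.47). This is a sketch under stated assumptions: the profile $f$ is taken to be a Gaussian, the wave speed `V` is an arbitrary illustrative value, and the function names are hypothetical.

```python
import math

V = 2.0   # wave speed v (illustrative value)


def f(x):
    """Localized initial profile (an assumption of this sketch)."""
    return math.exp(-x * x)


def y(x, t, direction=+1):
    """direction=+1 gives f(x - v t), moving toward +x; direction=-1 gives f(x + v t)."""
    return f(x - direction * V * t)


def second_difference(g, s, h=1e-3):
    """Central-difference estimate of the second derivative g''(s)."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)


def wave_residual(x, t, direction=+1):
    """d^2 y/dx^2 - (1/v^2) d^2 y/dt^2, which should vanish up to discretization error."""
    d2x = second_difference(lambda s: y(s, t, direction), x)
    d2t = second_difference(lambda s: y(x, s, direction), t)
    return d2x - d2t / V ** 2


if __name__ == "__main__":
    for direction in (+1, -1):
        print(direction, wave_residual(0.4, 0.3, direction))
```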

The accomplishment of the Fourier transform here deserves special 
emphasis. 

• The Fourier transform converts a PDE into an ODE, where the “degree of 
transcendence” of the problem is reduced. 

In Section 15.10, Laplace transforms are used to convert ODEs (with constant 
coefficients) into algebraic equations. Again, the degree of transcendence is 
reduced. The problem is simplified, as outlined in Fig. 15.1. 

EXERCISES 

15.5.1 Equation (15.45) yields

\[
F_2(\omega) = -\omega^2 F(\omega)
\]

for the Fourier transform of the second derivative of $f(x)$. The condition $f(x) \to 0$ for $x \to \pm\infty$ may be relaxed slightly. Find the least restrictive condition for the preceding equation for $F_2(\omega)$ to hold.

ANS. $\left. \left[ \dfrac{df(x)}{dx} - i\omega f(x) \right] e^{i\omega x} \right|_{-\infty}^{\infty} = 0$.

15.5.2 (a) Given that $F(\mathbf{k})$ is the three-dimensional Fourier transform of $f(\mathbf{r})$ and $\mathbf{F}_1(\mathbf{k})$ is the three-dimensional Fourier transform of $\nabla f(\mathbf{r})$, show that

\[
\mathbf{F}_1(\mathbf{k}) = (-i\mathbf{k}) F(\mathbf{k}).
\]

This is a three-dimensional generalization of Eq. (15.45) for $n = 1$.

(b) Show that the three-dimensional Fourier transform of $\nabla \cdot \nabla f(\mathbf{r})$ is

\[
F_2(\mathbf{k}) = (-i\mathbf{k})^2 F(\mathbf{k}).
\]

Note. Vector $\mathbf{k}$ is in the transform space. In Section 15.7, we shall have $\hbar\mathbf{k} = \mathbf{p}$, linear momentum.

15.5.3 Show



by contour integration in conjunction with the residue theorem.
Hint. Use spherical polar coordinates in $k$-space.

15.5.4 Solve the PDE

\[
\frac{\partial y}{\partial t} = a^2 \frac{\partial^2 y}{\partial x^2}
\]

by Fourier transform, where $y(x, t = 0) = 0$, $x > 0$, $y(x = 0, t) = f(t)$, $t > 0$, and $a$ is a constant.

15.5.5 Show that the three-dimensional Fourier exponential transform of a radially symmetric function may be rewritten as a Fourier sine transform:

\[
\frac{1}{(2\pi)^{3/2}} \int f(r)\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3x = \frac{1}{k} \sqrt{\frac{2}{\pi}} \int_0^{\infty} [r f(r)] \sin kr\, dr.
\]

15.6 Convolution Theorem 


We employ convolutions to solve differential equations and to normalize mo¬ 
mentum wave functions. 

Let us consider two functions $f(x)$ and $g(x)$ with Fourier transforms $F(\omega)$ and $G(\omega)$, respectively. We define the operation

$$f * g \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(y) f(x - y)\, dy \tag{15.56}$$

as the convolution of the two functions $f$ and $g$ over the interval $(-\infty, \infty)$. This form of an integral appears in probability theory in the determination of the probability density of two random, independent variables. Our solution of Poisson's equation (i.e., the Coulomb potential) may be interpreted as a convolution of a charge distribution, $\rho(\mathbf{r}_2)$, and a weighting function, $(4\pi\varepsilon_0 |\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$. In other works this is sometimes referred to as the Faltung, the German term for "folding."⁵ We now transform the integral in Eq. (15.56) by introducing the Fourier transforms, interchanging the order of integration, and transforming $g(y)$:

$$\int_{-\infty}^{\infty} g(y) f(x - y)\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(y) \int_{-\infty}^{\infty} F(\omega) e^{-i\omega(x - y)}\, d\omega\, dy$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega) \left[ \int_{-\infty}^{\infty} g(y) e^{i\omega y}\, dy \right] e^{-i\omega x}\, d\omega = \int_{-\infty}^{\infty} F(\omega) G(\omega) e^{-i\omega x}\, d\omega. \tag{15.57}$$

Comparing with Eq. (15.56), this shows that

$$f * g = \mathcal{F}^{-1}(FG).$$

In other words, the Fourier inverse transform of a product of Fourier transforms is the convolution of the original functions, $f * g$.


5 For $f(y) = e^{-y}$, $f(y)$ and $f(x - y)$ are plotted in Fig. 15.8. Clearly, $f(y)$ and $f(x - y)$ are mirror images of each other in relation to the vertical line $y = x/2$; that is, we could generate $f(x - y)$ by folding over $f(y)$ on the line $y = x/2$.

Figure 15.8
Convolution-Faltung



EXAMPLE 15.6.1 


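Convolution Integral Let us apply the convolution Eq. (15.57) with $f$, $F$ from Example 15.2.2 and $g$, $G$ from Example 15.4.1 so that

$$f(x) = e^{-a^2 x^2}, \qquad F(\omega) = \frac{1}{a\sqrt{2}} \exp\left(-\frac{\omega^2}{4a^2}\right),$$

$$g(y) = e^{-b|y|}, \qquad G(\omega) = \sqrt{\frac{2}{\pi}}\, \frac{b}{b^2 + \omega^2}.$$

From Example 15.4.1, recall that

$$\int_{-\infty}^{\infty} \frac{e^{-i\omega y}\, d\omega}{b^2 + \omega^2} = 2\int_0^{\infty} \frac{\cos \omega y \, d\omega}{b^2 + \omega^2} = \frac{\pi}{b}\, e^{-by}, \qquad y > 0,$$

using the Euler identity $e^{-i\omega y} = \cos \omega y - i \sin \omega y$ and noticing that the sine integral vanishes because its integrand is odd under reversal of sign of $\omega$, whereas the cosine integrand is even.

Now we apply the convolution formula [Eq. (15.57)]

$$\int_{-\infty}^{\infty} e^{-b|y|} \exp\left(-a^2 (x - y)^2\right) dy = \frac{b}{a\sqrt{\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{\omega^2}{4a^2}\right) \frac{e^{-i\omega x}}{b^2 + \omega^2}\, d\omega.$$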
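The convolution formula just written can be checked numerically. The following is a minimal sketch, assuming a plain trapezoidal rule with hand-picked cutoffs; the helper names (`trapezoid`, `convolution_side`, `transform_side`) and the sample values of `x`, `a`, `b` are illustrative choices, not part of the text. It evaluates the $y$-integral and the $\omega$-integral independently and compares them.

```python
import math


def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


def convolution_side(x, a, b, cutoff=15.0, n=6000):
    """Integral over y of exp(-b|y|) * exp(-a^2 (x - y)^2)."""
    integrand = lambda y: math.exp(-b * abs(y) - a * a * (x - y) ** 2)
    # Split at y = 0, where |y| has a kink, so each piece is smooth.
    return (trapezoid(integrand, -cutoff, 0.0, n)
            + trapezoid(integrand, 0.0, cutoff, n))


def transform_side(x, a, b, cutoff=15.0, n=6000):
    """Integral over w of F(w) G(w) exp(-i w x).

    The imaginary part is odd in w and integrates to zero,
    so only cos(w x) is kept.
    """
    integrand = lambda w: (math.exp(-w * w / (4 * a * a))
                           * math.cos(w * x) / (b * b + w * w))
    return b / (a * math.sqrt(math.pi)) * trapezoid(integrand, -cutoff, cutoff, n)


# Illustrative parameter values (assumptions, not from the text).
x, a, b = 0.7, 1.0, 0.5
lhs = convolution_side(x, a, b)
rhs = transform_side(x, a, b)
print(f"y-integral     = {lhs:.8f}")
print(f"omega-integral = {rhs:.8f}")
```

The two numbers should agree to within the quadrature error, which illustrates that a product in $\omega$-space corresponds to a convolution in $x$-space.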


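The integral on the left-hand side, which we call $I$, can be manipulated into the error integral erfc (Section 10.4) by splitting the interval and substituting $y \to -y$ in the $(-\infty, 0)$ part, giving

$$I = \int_{-\infty}^{\infty} e^{-b|y|} \exp\left(-a^2 (x - y)^2\right) dy = \int_0^{\infty} e^{-by} \exp\left(-a^2 (x - y)^2\right) dy + \int_0^{\infty} e^{-by} \exp\left(-a^2 (x + y)^2\right) dy.$$

Now we substitute $\xi = y - x$ in the first integral and $\xi = y + x$ in the second, yielding

$$I = e^{-bx} \int_{-x}^{\infty} e^{-b\xi - a^2 \xi^2}\, d\xi + e^{bx} \int_{x}^{\infty} e^{-b\xi - a^2 \xi^2}\, d\xi.$$

Completing the square in the exponent as in Example 15.2.2 using

$$a^2 \xi^2 + b\xi = a^2 \left( \xi + \frac{b}{2a^2} \right)^2 - \frac{b^2}{4a^2},$$

we obtain, with the substitution $\eta / a = \xi + b/2a^2$,

$$I = \frac{1}{a}\, e^{-bx + b^2/4a^2} \int_{-ax + b/2a}^{\infty} e^{-\eta^2}\, d\eta + \frac{1}{a}\, e^{bx + b^2/4a^2} \int_{ax + b/2a}^{\infty} e^{-\eta^2}\, d\eta$$

so that finally

$$I = \frac{\sqrt{\pi}}{2a}\, e^{b^2/4a^2} \left[ e^{-bx}\, \mathrm{erfc}\left( \frac{b}{2a} - ax \right) + e^{bx}\, \mathrm{erfc}\left( \frac{b}{2a} + ax \right) \right]. \tag{15.58}$$

Another example is provided by changing $g(y)$ in the previous example to the square pulse $g(y) = 1$, for $|y| < 1$ and zero elsewhere. Its Fourier transform is given in Example 15.2.1 as

$$G(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(y) e^{i\omega y}\, dy = \sqrt{\frac{2}{\pi}}\, \frac{\sin \omega}{\omega}.$$

The convolution with $f(x)$ takes the interesting form

$$\int_{-1}^{1} \exp\left(-a^2 (x - y)^2\right) dy = \frac{1}{a\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-\omega^2/4a^2}\, \frac{\sin \omega}{\omega}\, e^{-i\omega x}\, d\omega,$$

where the left-hand side can again be converted to a difference of error integrals

$$\int_{-1}^{1} \exp\left(-a^2 (x - y)^2\right) dy = \frac{\sqrt{\pi}}{2a} \left[ \mathrm{erfc}(-a(1 + x)) - \mathrm{erfc}(a(1 - x)) \right].$$


EXAMPLE 15.6.2
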


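Coulomb Potential by Convolution The Coulomb potential for an extended charge distribution $\rho$ of a composite system,

$$V(\mathbf{r}) = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3 r',$$

appears to be a three-dimensional case of a convolution integral. If we recall the charge form factor $G_E$ as the Fourier transform of the charge density from Example 15.4.3,

$$\frac{1}{(2\pi)^{3/2}} \int \rho(\mathbf{r}) e^{i\mathbf{p}\cdot\mathbf{r}}\, d^3 r = \frac{G_E(\mathbf{p}^2)}{(2\pi)^{3/2}},$$

and $1/\mathbf{p}^2$ as the Fourier transform of $1/|\mathbf{r} - \mathbf{r}'|$ from Exercise 15.5.3,

$$\frac{(2\pi)^{3/2}}{4\pi |\mathbf{r} - \mathbf{r}'|} = \int \frac{e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')}}{(2\pi)^{3/2}\, \mathbf{p}^2}\, d^3 p,$$

being careful to include all normalizations, then we can apply the convolution theorem to obtain

$$V(\mathbf{r}) = \int \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3 r' = 4\pi \int \frac{G_E(\mathbf{p}^2)}{\mathbf{p}^2}\, e^{-i\mathbf{p}\cdot\mathbf{r}}\, \frac{d^3 p}{(2\pi)^3}. \tag{15.59}$$

Let us now evaluate this result for the proton Example 15.4.3. This gives

$$V(\mathbf{r}) = \frac{4\pi}{R_1^2 - R^2} \int \frac{1}{\mathbf{p}^2} \left( \frac{R_1^2}{1 + \mathbf{p}^2 R_1^2} - \frac{R^2}{1 + \mathbf{p}^2 R^2} \right) e^{-i\mathbf{p}\cdot\mathbf{r}}\, \frac{d^3 p}{(2\pi)^3}$$

$$= \frac{4\pi}{R_1^2 - R^2} \int \left[ \left( \frac{1}{\mathbf{p}^2} - \frac{R_1^2}{1 + \mathbf{p}^2 R_1^2} \right) R_1^2 - \left( \frac{1}{\mathbf{p}^2} - \frac{R^2}{1 + \mathbf{p}^2 R^2} \right) R^2 \right] e^{-i\mathbf{p}\cdot\mathbf{r}}\, \frac{d^3 p}{(2\pi)^3}$$

$$= \frac{R_1^2}{R_1^2 - R^2} \left( \frac{1}{r} - \frac{e^{-r/R_1}}{r} \right) - \frac{R^2}{R_1^2 - R^2} \left( \frac{1}{r} - \frac{e^{-r/R}}{r} \right)$$

$$= \frac{1}{r} - \frac{1}{R_1^2 - R^2}\, \frac{1}{r} \left( R_1^2 e^{-r/R_1} - R^2 e^{-r/R} \right)$$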


for the electrostatic potential, a pointlike Coulomb potential combined with a 
Yukawa shape which remains finite as r —> 0. ■ 


Parseval’s Relation 


Results analogous to Eq. (15.57) may be derived for the Fourier sine and cosine 
transforms (Exercises 15.6.1 and 15.6.2). 

For the special case $x = 0$ in Eq. (15.57), we have

$$\int_{-\infty}^{\infty} F(\omega) G(\omega)\, d\omega = \int_{-\infty}^{\infty} f(-y) g(y)\, dy. \tag{15.60}$$


Equation (15.60) and the corresponding sine and cosine convolutions are of¬ 
ten called Parseval’s relations by analogy with Parseval’s theorem for Fourier 
series (Chapter 14, Exercise 14.4.2). However, the minus sign in — y suggests 
that modifications be tried. We now do this with g* instead of g using a different 
technique. 

The Parseval relation⁶,⁷

$$\int_{-\infty}^{\infty} F(\omega) G^*(\omega)\, d\omega = \int_{-\infty}^{\infty} f(t) g^*(t)\, dt \tag{15.61}$$

may be derived elegantly using the Dirac delta function representation [Eq. (15.26)].

6 Note that all arguments are positive, in contrast to Eq. (15.60).

7 Some authors prefer to restrict Parseval's name to series and refer to Eq. (15.61) as Rayleigh's theorem.

We have

$$\int_{-\infty}^{\infty} f(t) g^*(t)\, dt = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega) e^{-i\omega t}\, d\omega \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G^*(x) e^{ixt}\, dx\, dt, \tag{15.62}$$

with attention to the complex conjugation in the $G^*(x)$ to $g^*(t)$ transform. Integrating over $t$ first and using Eq. (15.26), we obtain

$$\int_{-\infty}^{\infty} f(t) g^*(t)\, dt = \int_{-\infty}^{\infty} F(\omega) \int_{-\infty}^{\infty} G^*(x) \delta(x - \omega)\, dx\, d\omega = \int_{-\infty}^{\infty} F(\omega) G^*(\omega)\, d\omega, \tag{15.63}$$

our desired Parseval relation. (The * of complex conjugation can also be applied to $f$ and $F$ instead.) If $f(t) = g(t)$, then the integrals in the Parseval relation are normalization integrals (Section 9.4). Equation (15.63) guarantees that if a function $f(t)$ is normalized to unity, its transform $F(\omega)$ is likewise normalized to unity. This is extremely important in quantum mechanics, as discussed in the next section.

It may be shown that the Fourier transform is a unitary operation (in the Hilbert space $L^2$ of square integrable functions). The Parseval relation is a reflection of this unitary property, analogous to Exercise 3.4.14 for matrices.

In Fraunhofer diffraction optics, the diffraction pattern (amplitude) appears as the transform of the function describing the aperture (compare Example 15.2.1). With intensity proportional to the square of the amplitude, the Parseval relation implies that the energy passing through the aperture seems to be somewhere in the diffraction pattern, a statement of the conservation of energy.

Parseval's relations may be developed independently of the inverse Fourier transform and then used rigorously to derive the inverse transform. Details are given by Morse and Feshbach,⁸ Section 4.8 (see also Exercise 15.6.3).


EXAMPLE 15.6.3

Integral by Parseval's Relation Evaluate the integral $\int_{-\infty}^{\infty} \frac{d\omega}{(a^2 + \omega^2)^2}$. We start by recalling from Example 15.4.1 that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{e^{-i\omega x}\, d\omega}{a^2 + \omega^2} = \sqrt{\frac{2}{\pi}} \int_0^{\infty} \frac{\cos \omega x \, d\omega}{a^2 + \omega^2} = \frac{1}{a} \sqrt{\frac{\pi}{2}}\, e^{-ax}, \qquad x > 0.$$

Next we apply Parseval's relation to

$$\int_{-\infty}^{\infty} \frac{d\omega}{(a^2 + \omega^2)^2} = \frac{\pi}{2a^2} \int_{-\infty}^{\infty} e^{-2a|x|}\, dx = \frac{\pi}{a^2} \int_0^{\infty} e^{-2ax}\, dx = -\frac{\pi}{a^2}\, \frac{e^{-2ax}}{2a} \Bigg|_0^{\infty} = \frac{\pi}{2a^3}.$$

8 Morse, P. M., and Feshbach, H. (1953). Methods of Theoretical Physics. McGraw-Hill, New York.

EXERCISES

15.6.1 Work out the convolution equation corresponding to Eq. (15.57) for
(a) Fourier sine transforms

$$\frac{1}{2} \int_0^{\infty} g(y) [f(y + x) + f(y - x)]\, dy = \int_0^{\infty} F_s(t) G_s(t) \cos tx \, dt,$$

where $f$ and $g$ are odd functions.

(b) Fourier cosine transforms

$$\frac{1}{2} \int_0^{\infty} g(y) [f(y + x) + f(|x - y|)]\, dy = \int_0^{\infty} F_c(t) G_c(t) \cos tx \, dt,$$

where $f$ and $g$ are even functions.

15.6.2 Show that for both Fourier sine and Fourier cosine transforms Parseval's relation has the form

$$\int_0^{\infty} F(t) G(t)\, dt = \int_0^{\infty} f(y) g(y)\, dy.$$

15.6.3 Starting from Parseval's relation [Eq. (15.61)], let $g(y) = 1$, $0 < y < a$, and zero elsewhere. From this derive the Fourier inverse transform [Eq. (15.28)].

Hint. Differentiate with respect to $a$.

15.6.4 Solve Poisson's equation $\nabla^2 \psi(\mathbf{r}) = -\rho(\mathbf{r})/\varepsilon_0$ by the following sequence of operations:

(a) Take the Fourier transform of both sides of this equation. Solve for the Fourier transform of $\psi(\mathbf{r})$.

(b) Carry out the Fourier inverse transform by using a three-dimensional analog of the convolution theorem [Eq. (15.57)].

15.6.5 With $F(\omega)$ and $G(\omega)$ the Fourier transforms of $f(t)$ and $g(t)$, respectively, show that

$$\int_{-\infty}^{\infty} |f(t) - g(t)|^2\, dt = \int_{-\infty}^{\infty} |F(\omega) - G(\omega)|^2\, d\omega.$$

If $g(t)$ is an approximation to $f(t)$, the preceding relation indicates that the mean square deviation in $\omega$-space is equal to the mean square deviation in $t$-space.

15.6.6 Use the Parseval relation to evaluate $\int_{-\infty}^{\infty} \frac{\omega^2\, d\omega}{(\omega^2 + a^2)^2}$.

Hint. Compare Example 15.4.2.

ANS. $\dfrac{\pi}{2a}$.
15.7 Momentum Representation

In advanced mechanics and in quantum mechanics, linear momentum and spatial position occur on an equal footing. In this section, we start with the usual space distribution and derive the corresponding momentum distribution. For the one-dimensional case, our wave function $\psi(x)$, a solution of the Schrödinger wave equation, has the following properties:

1. $\psi^*(x)\psi(x)\,dx$ is the probability of finding the quantum particle between $x$ and $x + dx$ and

2. $$\int_{-\infty}^{\infty} \psi^*(x)\psi(x)\, dx = 1, \tag{15.64}$$

corresponding to one particle (along the x-axis). In addition, we have

3. $$\langle x \rangle = \int_{-\infty}^{\infty} \psi^*(x)\, x\, \psi(x)\, dx \tag{15.65}$$

for the average position of the particle along the x-axis. This is often called an expectation value.

We want a function $g(p)$ that will give the same information about the momentum.

1. $g^*(p)g(p)\,dp$ is the probability that our quantum particle has a momentum between $p$ and $p + dp$.

2. $$\int_{-\infty}^{\infty} g^*(p)g(p)\, dp = 1. \tag{15.66}$$

3. $$\langle p \rangle = \int_{-\infty}^{\infty} g^*(p)\, p\, g(p)\, dp. \tag{15.67}$$

As subsequently shown, such a function is given by the Fourier transform of our space function $\psi(x)$. Specifically,⁹

$$g(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x) e^{-ipx/\hbar}\, dx, \tag{15.68}$$

$$g^*(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi^*(x) e^{ipx/\hbar}\, dx. \tag{15.69}$$

The corresponding three-dimensional momentum function is

$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}} \int_{-\infty}^{\infty} \psi(\mathbf{r}) e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\, d^3 r.$$

To verify Eqs. (15.68) and (15.69), let us check properties 2 and 3.

9 The $\hbar$ may be avoided by using the wave number $k$, $p = k\hbar$ (and $\mathbf{p} = \mathbf{k}\hbar$) so that the transform takes the form $(2\pi)^{-1/2} \int \psi(x) e^{-ikx}\, dx$.

Property 2, the normalization, is automatically satisfied as a Parseval relation [Eq. (15.61)]. If the space function $\psi(x)$ is normalized to unity, the momentum function $g(p)$ is also normalized to unity.

To check property 3, we must show that

$$\langle p \rangle = \int_{-\infty}^{\infty} g^*(p)\, p\, g(p)\, dp = \int_{-\infty}^{\infty} \psi^*(x)\, \frac{\hbar}{i} \frac{d}{dx} \psi(x)\, dx, \tag{15.70}$$

where $(\hbar/i)(d/dx)$ is the momentum operator in the space representation. We replace the momentum functions by Fourier transformed space functions, and the first integral becomes

$$\frac{1}{2\pi\hbar} \iiint_{-\infty}^{\infty} p\, e^{-ip(x - x')/\hbar}\, \psi^*(x') \psi(x)\, dp\, dx'\, dx. \tag{15.71}$$

Now

$$p\, e^{-ip(x - x')/\hbar} = \frac{d}{dx} \left[ -\frac{\hbar}{i}\, e^{-ip(x - x')/\hbar} \right]. \tag{15.72}$$

Substituting into Eq. (15.71) and integrating by parts, holding $x'$ and $p$ constant, we obtain

$$\langle p \rangle = \iint_{-\infty}^{\infty} \left[ \frac{1}{2\pi\hbar} \int_{-\infty}^{\infty} e^{-ip(x - x')/\hbar}\, dp \right] \psi^*(x')\, \frac{\hbar}{i} \frac{d}{dx} \psi(x)\, dx'\, dx. \tag{15.73}$$

Here, we assume $\psi(x)$ vanishes as $x \to \pm\infty$, eliminating the integrated part. Again using the Dirac delta function [Eq. (15.23)], the factor in square brackets is $\delta(x - x')$, and Eq. (15.73) reduces to Eq. (15.70) to verify our momentum representation. Note that technically we have employed the inverse Fourier transform in Eq. (15.68). This was chosen deliberately to yield the proper sign in Eq. (15.73).
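Properties 2 and 3 can also be checked numerically for a concrete wave function. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, a Gaussian wave packet $\psi(x) = \pi^{-1/4} e^{-x^2/2} e^{ik_0 x}$ chosen by us (it is not taken from the text), trapezoidal quadrature on finite grids, and a central difference for $d\psi/dx$. It computes $g(p)$ from Eq. (15.68) and compares $\langle p \rangle$ in both representations, as in Eq. (15.70).

```python
import cmath
import math

HBAR = 1.0   # assumed units
K0 = 1.3     # illustrative mean wave number of the packet
DX, DP = 0.05, 0.1


def psi(x):
    """Normalized Gaussian packet with mean momentum HBAR * K0 (our choice)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x) * cmath.exp(1j * K0 * x)


def trapz(values, h):
    """Trapezoidal rule for samples on a uniform grid of spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


xs = [-8.0 + DX * i for i in range(321)]
ps = [K0 - 8.0 + DP * j for j in range(161)]


def g(p):
    """Momentum function from Eq. (15.68), by direct quadrature over x."""
    samples = [psi(x) * cmath.exp(-1j * p * x / HBAR) for x in xs]
    return trapz(samples, DX) / math.sqrt(2 * math.pi * HBAR)


g_vals = [g(p) for p in ps]

# Property 2: normalization carried over by Parseval's relation.
norm_p = trapz([abs(v) ** 2 for v in g_vals], DP)

# Property 3, momentum side: integral of g* p g.
mean_p_momentum = trapz([p * abs(v) ** 2 for p, v in zip(ps, g_vals)], DP)


def dpsi(x, eps=1e-4):
    """Central-difference estimate of d(psi)/dx."""
    return (psi(x + eps) - psi(x - eps)) / (2 * eps)


# Property 3, space side: integral of psi* (hbar/i) d(psi)/dx.
mean_p_space = trapz(
    [psi(x).conjugate() * (HBAR / 1j) * dpsi(x) for x in xs], DX
).real

print(norm_p, mean_p_momentum, mean_p_space)
```

For this packet both expectation values should come out close to $\hbar k_0$, and the momentum-space norm close to 1.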


EXAMPLE 15.7.1 


Hydrogen Atom The hydrogen atom ground state¹⁰ may be described by the spatial wave function

$$\psi(\mathbf{r}) = \left( \frac{1}{\pi a_0^3} \right)^{1/2} e^{-r/a_0}, \tag{15.74}$$

with $a_0$ being the Bohr radius, $\hbar^2/me^2$. We now have a three-dimensional wave function. The transform corresponding to Eq. (15.68) is

$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}} \int \psi(\mathbf{r}) e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\, d^3 r. \tag{15.75}$$

Substituting Eq. (15.74) into Eq. (15.75) and using

$$\int e^{-ar + i\mathbf{b}\cdot\mathbf{r}}\, d^3 r = \frac{8\pi a}{(a^2 + b^2)^2}, \tag{15.76}$$

we obtain the hydrogenic momentum wave function

$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\, \frac{a_0^{3/2} \hbar^{5/2}}{(a_0^2 p^2 + \hbar^2)^2}. \tag{15.77}$$

10 For a momentum representation treatment of the hydrogen atom, $l = 0$ states, see Ivash, E. V. (1972). A momentum representation treatment of the hydrogen atom problem. Am. J. Phys. 40, 1095.
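A quick numerical sanity check of Eq. (15.77) is to confirm that $g(\mathbf{p})$ is normalized, $4\pi \int_0^\infty |g(p)|^2 p^2\, dp = 1$. The sketch below assumes a trapezoidal rule with a finite momentum cutoff; the function names and the sample values of `a0` and `hbar` are illustrative and not from the text.

```python
import math


def momentum_wavefunction(p, a0, hbar):
    """Hydrogen ground-state momentum function, Eq. (15.77)."""
    return (2 ** 1.5 / math.pi) * a0 ** 1.5 * hbar ** 2.5 / (a0 * a0 * p * p + hbar * hbar) ** 2


def norm_integral(a0, hbar, n=200000):
    """4*pi * integral of |g|^2 p^2 dp from 0 to a large cutoff (trapezoidal rule)."""
    upper = 200.0 * hbar / a0   # cutoff far beyond the natural scale hbar/a0
    h = upper / n
    total = 0.0                 # integrand vanishes at p = 0
    for i in range(1, n):
        p = i * h
        total += p * p * momentum_wavefunction(p, a0, hbar) ** 2
    total += 0.5 * upper * upper * momentum_wavefunction(upper, a0, hbar) ** 2
    return 4.0 * math.pi * total * h


# Arbitrary test values; the result should not depend on them.
norm_value = norm_integral(a0=2.0, hbar=0.5)
print(f"normalization integral = {norm_value:.8f}")
```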


Such momentum functions have been found useful in problems such as 
Compton scattering from atomic electrons, the wavelength distribution of the 
scattered radiation, depending on the momentum distribution of the target 
electrons. 

The relation between the ordinary space representation and the momentum 
representation may be clarified by considering the basic commutation relations 
of quantum mechanics. We go from a classical Hamiltonian to the Schrodinger 
equation by requiring that momentum p and position x not commute. Instead, 
we require that 

$$[p, x] \equiv px - xp = -i\hbar. \tag{15.78}$$

For the multidimensional case, Eq. (15.78) is replaced by

$$[p_k, x_j] = -i\hbar\, \delta_{kj}. \tag{15.79}$$

The Schrödinger (space) representation is obtained by using

$$(x):\quad p_k \to -i\hbar \frac{\partial}{\partial x_k},$$

replacing the momentum by a partial space derivative. We see that

$$[p, x]\psi(x) = -i\hbar\, \psi(x). \tag{15.80}$$

However, Eq. (15.78) can equally well be satisfied by using

$$(p):\quad x_j \to i\hbar \frac{\partial}{\partial p_j}.$$

This is the momentum representation. Then

$$[p, x] g(p) = -i\hbar\, g(p). \tag{15.81}$$

Hence, the representation $(x)$ is not unique; $(p)$ is an alternate possibility.

In general, the Schrodinger representation ( x ) leading to the Schrodinger 
equation is more convenient because the potential energy V is generally given 
as a function of position V (x, y, z). The momentum representation (p) usu¬ 
ally leads to an integral equation. For an exception, consider the harmonic 
oscillator. 


EXAMPLE 15.7.2 


Simple Harmonic Oscillator The classical Hamiltonian (kinetic energy + potential energy = total energy) is

$$H(p, x) = \frac{p^2}{2m} + \frac{1}{2} k x^2 = E, \tag{15.82}$$

where $k$ is Hooke's law constant.

In the Schrödinger representation we obtain

$$-\frac{\hbar^2}{2m} \frac{d^2 \psi(x)}{dx^2} + \frac{1}{2} k x^2 \psi(x) = E \psi(x). \tag{15.83}$$

For total energy $E$ equal to $\sqrt{k/m}\, \hbar/2$, there is a solution (Section 13.1)

$$\psi(x) = e^{-(\sqrt{mk}/(2\hbar)) x^2}. \tag{15.84}$$

The momentum representation leads to

$$\frac{p^2}{2m}\, g(p) - \frac{\hbar^2 k}{2} \frac{d^2 g(p)}{dp^2} = E g(p). \tag{15.85}$$

Again, for

$$E = \sqrt{\frac{k}{m}}\, \frac{\hbar}{2}, \tag{15.86}$$

the momentum wave equation (15.85) is satisfied by

$$g(p) = e^{-p^2/(2\hbar\sqrt{mk})}. \tag{15.87}$$

Either representation, space or momentum (and an infinite number of other possibilities), may be used, depending on which is more convenient for the particular problem under consideration.

The demonstration that $g(p)$ is the momentum wave function corresponding to Eq. (15.83), that is, that it is the Fourier inverse transform of Eq. (15.83), is left as Exercise 15.7.3. ■


SUMMARY

Fourier integrals derive their importance from the momentum space representation in quantum mechanics. Fourier transformation of an ODE with constant coefficients leads to a polynomial, and that of a PDE with constant coefficients converts the PDE to an ODE.


EXERCISES 

15.7.1 A linear quantum oscillator in its ground state has a wave function

$$\psi(x) = a^{-1/2} \pi^{-1/4} e^{-x^2/2a^2}.$$

Show that the corresponding momentum function is

$$g(p) = a^{1/2} \pi^{-1/4} \hbar^{-1/2} e^{-a^2 p^2/2\hbar^2}.$$

15.7.2 The nth excited state of the linear quantum oscillator is described by

$$\psi_n(x) = a^{-1/2}\, 2^{-n/2}\, \pi^{-1/4}\, (n!)^{-1/2}\, e^{-x^2/2a^2} H_n(x/a),$$

where $H_n(x/a)$ is the nth Hermite polynomial (Section 13.1). As an extension of Exercise 15.7.1, find the momentum function corresponding to $\psi_n(x)$.

Hint. $\psi_n(x)$ may be represented by $(\hat{a}^\dagger)^n \psi_0(x)$, where $\hat{a}^\dagger$ is the raising operator.


15.7.3 A free particle in quantum mechanics is described by a plane wave

$$\psi_k(x, t) = e^{i[kx - (\hbar k^2/2m)t]}.$$

Combining waves of adjacent momentum with an amplitude weighting factor $\varphi(k)$, we form a wave packet

$$\Psi(x, t) = \int_{-\infty}^{\infty} \varphi(k)\, e^{i[kx - (\hbar k^2/2m)t]}\, dk.$$

(a) Solve for $\varphi(k)$ given that

$$\Psi(x, 0) = e^{-x^2/2a^2}.$$

(b) Using the known value of $\varphi(k)$, integrate to get the explicit form of $\Psi(x, t)$. Note that this wave packet diffuses or spreads out with time.

ANS. $$\Psi(x, t) = \frac{e^{-\{x^2/2[a^2 + (i\hbar/m)t]\}}}{[1 + (i\hbar t/ma^2)]^{1/2}}.$$

Note. An interesting discussion of this problem from the evolution operator point of view is given by S. M. Blinder, Evolution of a Gaussian wave-packet. Am. J. Phys. 36, 525 (1968).


15.7.4 Find the time-dependent momentum wave function $g(k, t)$ corresponding to $\Psi(x, t)$ of Exercise 15.7.3. Show that the momentum wave packet $g^*(k, t) g(k, t)$ is independent of time.


15.7.5 The deuteron (Example 9.1.2) may be described reasonably well with a Hulthén wave function

$$\psi(\mathbf{r}) = A[e^{-\alpha r} - e^{-\beta r}]/r,$$

with $A$, $\alpha$, and $\beta$ constants. Find $g(\mathbf{p})$, the corresponding momentum wave function.

Note. The Fourier transform may be rewritten as Fourier sine and cosine transforms or as a Laplace transform (Section 15.8).


15.7.6 The nuclear form factor $F(\mathbf{k})$ and the charge distribution $\rho(\mathbf{r})$ are three-dimensional Fourier transforms of each other:

$$F(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int \rho(\mathbf{r}) e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3 r.$$

If the measured form factor is

$$F(\mathbf{k}) = (2\pi)^{-3/2} \left( 1 + \frac{k^2}{a^2} \right)^{-1},$$

find the corresponding charge distribution.

ANS. $$\rho(\mathbf{r}) = \frac{a^2}{4\pi}\, \frac{e^{-ar}}{r}.$$






15.7.7 Check the normalization of the hydrogen momentum wave function

$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\, \frac{a_0^{3/2} \hbar^{5/2}}{(a_0^2 p^2 + \hbar^2)^2}$$

by direct evaluation of the integral

$$\int g^*(\mathbf{p}) g(\mathbf{p})\, d^3 p.$$

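15.7.8 With $\psi(\mathbf{r})$ a wave function in ordinary space and $\varphi(\mathbf{p})$ the corresponding momentum function, show that

(a) $$\frac{1}{(2\pi\hbar)^{3/2}} \int \mathbf{r}\, \psi(\mathbf{r}) e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\, d^3 r = i\hbar \nabla_p \varphi(\mathbf{p}),$$

(b) $$\frac{1}{(2\pi\hbar)^{3/2}} \int r^2 \psi(\mathbf{r}) e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\, d^3 r = (i\hbar \nabla_p)^2 \varphi(\mathbf{p}).$$

Note. $\nabla_p$ is the gradient in momentum space:

$$\hat{\mathbf{x}} \frac{\partial}{\partial p_x} + \hat{\mathbf{y}} \frac{\partial}{\partial p_y} + \hat{\mathbf{z}} \frac{\partial}{\partial p_z}.$$

These results may be extended to any positive integer power of $r$ and therefore to any (analytic) function that may be expanded as a Maclaurin series in $r$.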


15.7.9 The ordinary space wave function $\psi(\mathbf{r}, t)$ satisfies the time-dependent Schrödinger equation

$$i\hbar \frac{\partial \psi(\mathbf{r}, t)}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi + V(\mathbf{r}) \psi.$$

Show that the corresponding time-dependent momentum wave function satisfies the analogous equation

$$i\hbar \frac{\partial \varphi(\mathbf{p}, t)}{\partial t} = \frac{p^2}{2m} \varphi + V(i\hbar \nabla_p) \varphi.$$

Note. Assume that $V(\mathbf{r})$ may be expressed by a Maclaurin series and use Exercise 15.7.10. $V(i\hbar \nabla_p)$ is the same function of the variable $i\hbar \nabla_p$ as $V(\mathbf{r})$ is of the variable $\mathbf{r}$.


15.7.10 The one-dimensional, time-independent Schrödinger equation is

$$-\frac{\hbar^2}{2m} \frac{d^2 \psi(x)}{dx^2} + V(x) \psi(x) = E \psi(x).$$

For the special case of $V(x)$ an analytic function of $x$, show that the corresponding momentum wave equation is

$$V\left( i\hbar \frac{d}{dp} \right) g(p) + \frac{p^2}{2m}\, g(p) = E g(p).$$

Derive this momentum wave equation from the Fourier transform [Eq. (15.68)] and its inverse. Do not use the substitution $x \to i\hbar(d/dp)$ directly.
15.8 Laplace Transforms


Definition


The Laplace transform $f(s)$ or $\mathcal{L}$ of a function $F(t)$ is defined by¹¹

$$f(s) = \mathcal{L}\{F(t)\} = \lim_{a \to \infty} \int_0^a e^{-st} F(t)\, dt = \int_0^{\infty} e^{-st} F(t)\, dt. \tag{15.88}$$

A few comments on the existence of the integral are in order. The infinite integral of $F(t)$,

$$\int_0^{\infty} F(t)\, dt,$$

need not exist. For instance, $F(t)$ may diverge exponentially for large $t$. However, if there is some constant $s_0$ such that

$$|e^{-s_0 t} F(t)| \le M, \tag{15.89}$$

a positive constant for sufficiently large $t$, $t > t_0$, the Laplace transform [Eq. (15.88)] will exist for $s > s_0$; $F(t)$ is said to be of exponential order. As a counterexample, $F(t) = e^{t^2}$ does not satisfy the condition given by Eq. (15.89) and is not of exponential order. $\mathcal{L}\{e^{t^2}\}$ does not exist.

The Laplace transform may also fail to exist because of a sufficiently strong singularity in the function $F(t)$ as $t \to 0$; that is,

$$\int_0^{\infty} e^{-st} t^n\, dt$$

diverges at the origin for $n \le -1$. The Laplace transform $\mathcal{L}\{t^n\}$ does not exist for $n \le -1$.

Since for two functions $F(t)$ and $G(t)$, for which the integrals exist,

$$\mathcal{L}\{aF(t) + bG(t)\} = a\mathcal{L}\{F(t)\} + b\mathcal{L}\{G(t)\}, \tag{15.90}$$

the operation denoted by $\mathcal{L}$ is linear.


EXAMPLE 15.8.1 


Elementary Functions To illustrate the Laplace transform, let us apply the operation to some of the elementary functions. If

$$F(t) = 1, \qquad t > 0,$$

then

$$\mathcal{L}\{1\} = \int_0^{\infty} e^{-st}\, dt = \frac{1}{s}, \qquad \text{for } s > 0. \tag{15.91}$$

11 This is sometimes called a one-sided Laplace transform; the integral from $-\infty$ to $+\infty$ is referred to as a two-sided Laplace transform. Some authors introduce an additional factor of $s$. This extra $s$ appears to have little advantage and continually gets in the way (see Additional Reading, Jeffreys and Jeffreys, Section 14.13). Generally, we take $s$ to be real and positive. It is possible to have $s$ complex, provided $\Re(s) > 0$.

Again, let

$$F(t) = e^{kt}, \qquad t > 0.$$

The Laplace transform becomes

$$\mathcal{L}\{e^{kt}\} = \int_0^{\infty} e^{-st} e^{kt}\, dt = \frac{1}{s - k}, \qquad \text{for } s > k, \tag{15.92}$$

where the integral is finite. Using this relation, we obtain the Laplace transform of certain other functions. Since

$$\cosh kt = \frac{1}{2}(e^{kt} + e^{-kt}), \qquad \sinh kt = \frac{1}{2}(e^{kt} - e^{-kt}),$$

we have

$$\mathcal{L}\{\cosh kt\} = \frac{1}{2} \left( \frac{1}{s - k} + \frac{1}{s + k} \right) = \frac{s}{s^2 - k^2},$$

$$\mathcal{L}\{\sinh kt\} = \frac{1}{2} \left( \frac{1}{s - k} - \frac{1}{s + k} \right) = \frac{k}{s^2 - k^2}, \tag{15.93}$$

both valid for $s > k$, where the integrals are finite. Because the results are analytic functions of $s$, they may be continued analytically over the complex s-plane. This will prove useful for the inverse Laplace transform in Section 15.12. We use the relations

$$\cos kt = \cosh ikt, \qquad \sin kt = -i \sinh ikt \tag{15.94}$$

in Eq. (15.93), with $k$ replaced by $ik$, to find that the Laplace transforms

$$\mathcal{L}\{\cos kt\} = \frac{s}{s^2 + k^2},$$

$$\mathcal{L}\{\sin kt\} = \frac{k}{s^2 + k^2}, \tag{15.95}$$

both valid for $s > 0$, where the integrals are finite. Another derivation of this last transform is given in the next section. Note that $\lim_{s \to 0} \mathcal{L}\{\sin kt\} = 1/k$. This suggests we assign a value of $1/k$ to the Laplace transform $\lim_{s \to 0} \int_0^{\infty} e^{-st} \sin kt \, dt$.

Finally, for $F(t) = t^n$, we have

$$\mathcal{L}\{t^n\} = \int_0^{\infty} e^{-st} t^n\, dt,$$

which is the factorial function. Hence,

$$\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}, \qquad s > 0,\ n > -1. \tag{15.96}$$

Note that in all these transforms we have the variable $s$ in the denominator, that is, negative powers of $s$. In particular, $\lim_{s \to \infty} f(s) = 0$. The significance of this point is that if $f(s)$ involves positive powers of $s$, then $\lim_{s \to \infty} f(s) \to \infty$ and no inverse transform exists. ■
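The table entries of this example, Eqs. (15.91) to (15.96), can be spot-checked by evaluating the defining integral (15.88) numerically. The sketch below assumes a trapezoidal rule with the upper limit truncated at `t_max`; the helper `laplace_numeric` and the sample values of `s`, `k`, and the power `n_pow` are illustrative choices, with `s > k` so that every integral converges.

```python
import math


def laplace_numeric(F, s, t_max=40.0, n=40000):
    """Trapezoidal estimate of the integral of exp(-s t) F(t) from 0 to t_max."""
    h = t_max / n
    total = 0.5 * (F(0.0) + math.exp(-s * t_max) * F(t_max))
    for i in range(1, n):
        t = i * h
        total += math.exp(-s * t) * F(t)
    return total * h


s, k, n_pow = 2.0, 0.5, 3  # illustrative values with s > k > 0

# Each entry: name -> (F(t), closed form f(s) from the text).
table = {
    "1": (lambda t: 1.0, 1.0 / s),
    "exp(kt)": (lambda t: math.exp(k * t), 1.0 / (s - k)),
    "cosh(kt)": (lambda t: math.cosh(k * t), s / (s * s - k * k)),
    "sinh(kt)": (lambda t: math.sinh(k * t), k / (s * s - k * k)),
    "cos(kt)": (lambda t: math.cos(k * t), s / (s * s + k * k)),
    "sin(kt)": (lambda t: math.sin(k * t), k / (s * s + k * k)),
    "t^n": (lambda t: t ** n_pow, math.factorial(n_pow) / s ** (n_pow + 1)),
}

max_error = 0.0
for name, (F, closed_form) in table.items():
    numeric = laplace_numeric(F, s)
    max_error = max(max_error, abs(numeric - closed_form))
    print(f"{name:9s} numeric = {numeric:.7f}   closed form = {closed_form:.7f}")
```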
Inverse Transform


There is little importance to these operations, unless we can carry out the inverse transform as in Fourier transforms. That is, with

$$\mathcal{L}\{F(t)\} = f(s),$$

then

$$\mathcal{L}^{-1}\{f(s)\} = F(t).$$

Taken literally, this inverse transform is not unique. However, to the physicist and engineer the inverse operation may almost always be taken as unique in practical problems.

The inverse transform can be determined in various ways. A table of transforms can be built up and used to carry out the inverse transformation exactly as a table of logarithms can be used to look up antilogarithms. The preceding transforms constitute the beginnings of such a table. For a more complete set of Laplace transforms, see AMS-55, Chapter 29. Employing partial fraction expansions and various operational theorems, which are considered in succeeding sections, facilitates use of the tables. There is some justification for suspecting that these tables are probably of more value in solving textbook exercises than in solving real-world problems.

A general technique for $\mathcal{L}^{-1}$ will be developed in Section 15.12 by using the calculus of residues. The difficulties and the possibilities of a numerical approach (numerical inversion) are considered at the end of this section.


Partial Fraction Expansion 


Utilization of a table of transforms (or inverse transforms) is facilitated by expanding $f(s)$ in partial fractions.

Frequently, $f(s)$, our transform, occurs in the form $g(s)/h(s)$, where $g(s)$ and $h(s)$ are polynomials with no common factors, $g(s)$ being of lower degree than $h(s)$. If the factors of $h(s)$ are all linear and distinct, then by the theory of partial fractions, we may write

$$f(s) = \frac{c_1}{s - a_1} + \frac{c_2}{s - a_2} + \cdots + \frac{c_n}{s - a_n}, \tag{15.97}$$

where the $c_i$ are independent of $s$. The $a_i$ are the roots of $h(s)$. If any one of the roots (e.g., $a_1$) is multiple (occurring $m$ times), then $f(s)$ has the form

$$f(s) = \frac{c_{1,m}}{(s - a_1)^m} + \frac{c_{1,m-1}}{(s - a_1)^{m-1}} + \cdots + \frac{c_{1,1}}{s - a_1} + \sum_{i=2}^{n} \frac{c_i}{s - a_i}. \tag{15.98}$$

Finally, if one of the factors is quadratic, $(s^2 + ps + q)$, the numerator, instead of being a simple constant, will have the form

$$\frac{as + b}{s^2 + ps + q}.$$
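For the simplest case of Eq. (15.97), distinct linear factors, the constants can be computed mechanically. The sketch below assumes $h(s)$ is monic with known simple roots and uses $c_i = g(a_i) / \prod_{j \ne i} (a_i - a_j)$, which is the value of $(s - a_i) f(s)$ at $s = a_i$. The function name and the sample transform $f(s) = (s + 3)/[(s + 1)(s + 2)]$ are hypothetical illustrations, not taken from the text.

```python
import math


def partial_fraction_coefficients(numerator, roots):
    """Coefficients c_i of g(s)/h(s) = sum_i c_i / (s - a_i).

    numerator: callable g(s); roots: distinct roots a_i of a monic h(s).
    """
    coefficients = []
    for i, a_i in enumerate(roots):
        denominator = 1.0
        for j, a_j in enumerate(roots):
            if j != i:
                denominator *= (a_i - a_j)
        coefficients.append(numerator(a_i) / denominator)
    return coefficients


# Hypothetical example: f(s) = (s + 3) / ((s + 1)(s + 2)).
roots = [-1.0, -2.0]
coeffs = partial_fraction_coefficients(lambda s: s + 3.0, roots)
print(coeffs)  # [2.0, -1.0]


def f_direct(s):
    return (s + 3.0) / ((s + 1.0) * (s + 2.0))


def f_expanded(s):
    return sum(c / (s - a) for c, a in zip(coeffs, roots))


def inverse_transform(t):
    """Term-by-term inverse using L{exp(kt)} = 1/(s - k), Eq. (15.92)."""
    return sum(c * math.exp(a * t) for c, a in zip(coeffs, roots))


print(f_direct(1.0), f_expanded(1.0), inverse_transform(0.0))
```

Once the expansion is in hand, each term is inverted from the table, here giving $2e^{-t} - e^{-2t}$ for the sample transform.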
There are various ways of determining the constants introduced. For instance, in Eq. (15.97) we may multiply through by $(s - a_i)$ and obtain

$$c_i = \lim_{s \to a_i} (s - a_i) f(s). \tag{15.99}$$

In elementary cases a direct solution is often the easiest.


EXAMPLE 15.8.2

Partial Fraction Expansion Let

$$f(s) = \frac{k^2}{s(s^2 + k^2)}. \tag{15.100}$$

We want to bring $f(s)$ to the form

$$f(s) = \frac{c}{s} + \frac{as + b}{s^2 + k^2}.$$

Putting the right side of this equation over a common denominator and equating like powers of $s$ in the numerator, we obtain

$$\frac{k^2}{s(s^2 + k^2)} = \frac{c(s^2 + k^2) + s(as + b)}{s(s^2 + k^2)}, \tag{15.101}$$

$$c + a = 0,\ s^2; \qquad b = 0,\ s^1; \qquad ck^2 = k^2,\ s^0.$$

Solving these ($s \ne 0$), we have

$$c = 1, \qquad b = 0, \qquad a = -1,$$

giving

$$f(s) = \frac{1}{s} - \frac{s}{s^2 + k^2} \tag{15.102}$$

and

$$\mathcal{L}^{-1}\{f(s)\} = 1 - \cos kt \tag{15.103}$$

by Eqs. (15.91) and (15.95). ■


EXAMPLE 15.8.3

A Step Function As one application of Laplace transforms, consider the evaluation of

$$F(t) = \int_0^{\infty} \frac{\sin tx}{x}\, dx. \tag{15.104}$$

Suppose we take the Laplace transform of this definite integral, which is finite by virtue of the sign changes of the sine:

$$\mathcal{L}\left\{ \int_0^{\infty} \frac{\sin tx}{x}\, dx \right\} = \int_0^{\infty} e^{-st} \int_0^{\infty} \frac{\sin tx}{x}\, dx\, dt. \tag{15.105}$$

Figure 15.9
$F(t) = \int_0^{\infty} \frac{\sin tx}{x}\, dx$, a Step Function

Now interchanging the order of integration (which is justified),¹² we get

$$\int_0^{\infty} \frac{1}{x} \left[ \int_0^{\infty} e^{-st} \sin tx \, dt \right] dx = \int_0^{\infty} \frac{dx}{s^2 + x^2}, \tag{15.106}$$

where the inner integral may be done by integrating by parts as in Example 15.4.1. The factor in square brackets is the Laplace transform of $\sin tx$ from Eq. (15.95). Hence,

$$\int_0^{\infty} \frac{dx}{s^2 + x^2} = \frac{1}{s} \tan^{-1}\left( \frac{x}{s} \right) \Bigg|_0^{\infty} = \frac{\pi}{2s} = f(s). \tag{15.107}$$

By Eq. (15.91) we carry out the inverse transformation to obtain

$$F(t) = \frac{\pi}{2}, \qquad t > 0, \tag{15.108}$$

in agreement with an evaluation by the calculus of residues (Section 7.2). It has been assumed that $t > 0$ in $F(t)$. For $F(-t)$ we need note only that $\sin(-tx) = -\sin tx$, giving $F(-t) = -F(t)$. Finally, if $t = 0$, $F(0)$ is clearly zero. Therefore,

$$\int_0^{\infty} \frac{\sin tx}{x}\, dx = \frac{\pi}{2} [2u(t) - 1] = \begin{cases} \dfrac{\pi}{2}, & t > 0 \\ 0, & t = 0 \\ -\dfrac{\pi}{2}, & t < 0, \end{cases} \tag{15.109}$$

where $u(t)$ is the Heaviside unit step function of Example 15.3.1. Note that $\int_0^{\infty} (\sin tx / x)\, dx$, taken as a function of $t$, describes a step function (Fig. 15.9), a step of height $\pi$ at $t = 0$. ■
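The step behavior in Eq. (15.109) can be seen numerically by truncating the integral at a large upper limit. The sketch below is an illustration under stated assumptions: a trapezoidal rule on $[0, x_{\max}]$, and the convention $u(0) = 1/2$ for the Heaviside function so that the closed form gives 0 at $t = 0$. Function names and sample values are ours. The truncated integral still oscillates slightly about the limiting value, with an amplitude of order $1/(|t|\, x_{\max})$.

```python
import math


def truncated_sine_integral(t, x_max=200.0, n=40000):
    """Trapezoidal estimate of the integral of sin(t x)/x from 0 to x_max."""
    h = x_max / n

    def integrand(x):
        # sin(t x)/x tends to t as x -> 0.
        return t if x == 0.0 else math.sin(t * x) / x

    total = 0.5 * (integrand(0.0) + integrand(x_max))
    for i in range(1, n):
        total += integrand(i * h)
    return total * h


def step_form(t):
    """Right-hand side of Eq. (15.109), assuming u(0) = 1/2."""
    u = 1.0 if t > 0 else (0.5 if t == 0 else 0.0)
    return 0.5 * math.pi * (2.0 * u - 1.0)


for t in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"t = {t:5.1f}   truncated integral = {truncated_sine_integral(t):9.5f}"
          f"   (pi/2)[2u(t) - 1] = {step_form(t):9.5f}")
```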


The technique in the preceding example was to 

• introduce a second integration—the Laplace transform; 

• reverse the order of integration and integrate; and 

• take the inverse Laplace transform. 

There are many opportunities in which this technique of reversing the order of integration can be applied and prove very useful. Exercise 15.8.6 is a variation of this.


12 See Jeffreys and Jeffreys (1966), Chapter 1 (uniform convergence of integrals).

Numerical Inversion

As an integration, the Laplace transform is a highly stable operation, stable in the sense that small fluctuations (or errors) in $F(t)$ are averaged out in the determination of the area under a curve. Also, the weighting factor, $e^{-st}$, means that the behavior of $F(t)$ at large $t$ is effectively ignored unless $s$ is small. As a result of these two effects, a large change in $F(t)$ at large $t$ indicates a very small, perhaps insignificant, change in $f(s)$. In contrast to the Laplace transform operation, going from $f(s)$ to $F(t)$ is highly unstable. A minor change in $f(s)$ may result in a wild variation of $F(t)$. All significant figures may disappear. In a matrix formulation, the matrix is ill conditioned with respect to inversion.

There is no general, completely satisfactory numerical method for inverting Laplace transforms. However, if we are willing to restrict attention to relatively smooth functions, various possibilities open up. Bellman, Kalaba, and Lockett¹³ convert the Laplace transform to a Mellin transform ($x = e^{-t}$) and use numerical quadrature based on shifted Legendre polynomials, $P_n^*(x) = P_n(1 - 2x)$. The key step is analytic inversion of the resulting matrix. Krylov and Skoblya¹⁴ focus on the evaluation of the Bromwich integral (Section 15.12). As one technique, they replace the integrand with an interpolating polynomial of negative powers and integrate analytically.


EXERCISES

15.8.1 Prove that

$$\lim_{s \to \infty} s f(s) = \lim_{t \to +0} F(t).$$

Hint. Assume that $F(t)$ can be expressed as $F(t) = \sum_{n=0}^{\infty} a_n t^n$.

15.8.2 Show that

$$\frac{1}{\pi} \lim_{s \to 0} \mathcal{L}\{\cos xt\} = \delta(x).$$

15.8.3 Verify that

$$\mathcal{L}\left\{ \frac{\cos at - \cos bt}{b^2 - a^2} \right\} = \frac{s}{(s^2 + a^2)(s^2 + b^2)}, \qquad a^2 \ne b^2.$$

15.8.4 Using partial fraction expansions, show that

(a) $$\mathcal{L}^{-1}\left\{ \frac{1}{(s + a)(s + b)} \right\} = \frac{e^{-at} - e^{-bt}}{b - a}, \qquad a \ne b.$$

(b) $$\mathcal{L}^{-1}\left\{ \frac{s}{(s + a)(s + b)} \right\} = \frac{a e^{-at} - b e^{-bt}}{a - b}, \qquad a \ne b.$$

13 Bellman, R., Kalaba, R. E., and Lockett, J. A. (1966). Numerical Inversion of the Laplace Transforms. Elsevier, New York.

14 Krylov, V. I., and Skoblya, N. S. (1969). Handbook of Numerical Inversion of Laplace Transforms (D. Louvish, Trans.). Israel Program for Scientific Translations, Jerusalem.

15.8.5 Using partial fraction expansions, show that for $a^2 \ne b^2$

(a) $$\mathcal{L}^{-1}\left\{ \frac{1}{(s^2 + a^2)(s^2 + b^2)} \right\} = -\frac{1}{a^2 - b^2} \left\{ \frac{\sin at}{a} - \frac{\sin bt}{b} \right\},$$

(b) $$\mathcal{L}^{-1}\left\{ \frac{s^2}{(s^2 + a^2)(s^2 + b^2)} \right\} = \frac{1}{a^2 - b^2} \{ a \sin at - b \sin bt \}.$$


15.8.6 The electrostatic potential of a charged conducting disk is known to have the general form (circular cylindrical coordinates)

$$\int_0^{\infty} e^{-k|z|} J_0(k\rho) f(k)\, dk,$$

with $f(k)$ unknown. At large distances ($z \to \infty$) the potential must approach the Coulomb potential $q/4\pi\varepsilon_0 z$. Show that

$$\lim_{k \to 0} f(k) = \frac{q}{4\pi\varepsilon_0}.$$

Hint. You may set $\rho = 0$ and assume a Maclaurin expansion of $f(k)$ or, using $e^{-kz}$, construct a delta sequence.


15.8.7 A function $F(t)$ can be expanded in a power series (Maclaurin); that is,

$$F(t) = \sum_{n=0}^{\infty} a_n t^n.$$

Then

$$\mathcal{L}\{F(t)\} = \int_0^{\infty} e^{-st} \sum_{n=0}^{\infty} a_n t^n\, dt = \sum_{n=0}^{\infty} a_n \int_0^{\infty} e^{-st} t^n\, dt.$$

Show that $f(s)$, the Laplace transform of $F(t)$, contains no powers of $s$ greater than $s^{-1}$. Check your result by calculating $\mathcal{L}\{\delta(t)\}$ and comment on this fiasco.


15.9 Laplace Transform of Derivatives 


Perhaps the main application of Laplace transforms is in converting differential 
equations into simpler forms that may be solved more easily. It will be seen, for 
instance, that coupled differential equations with constant coefficients 
transform to simultaneous linear algebraic equations. 

Let us transform the first derivative of F(t): 



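$$\mathcal{L}\{F'(t)\} = \int_0^\infty e^{-st}\frac{dF(t)}{dt}\,dt.$$

Integrating by parts, we obtain

$$\mathcal{L}\{F'(t)\} = e^{-st}F(t)\Big|_0^\infty + s\int_0^\infty e^{-st}F(t)\,dt = s\mathcal{L}\{F(t)\} - F(0). \tag{15.110}$$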
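The derivative rule of Eq. (15.110) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `laplace`, the test function $F(t) = \cos kt$, and the values of `k` and `s` are illustrative assumptions, and the infinite integral is truncated at `t_max`, where $e^{-st}$ is negligible.

```python
import math

def laplace(func, s, t_max=40.0, n=40000):
    """Approximate the integral of exp(-s t) * func(t) over [0, t_max] (Simpson)."""
    h = t_max / n  # n must be even for Simpson's rule
    total = func(0.0) + math.exp(-s * t_max) * func(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-s * t) * func(t)
    return total * h / 3.0

k, s = 2.0, 1.5  # assumed sample values

def F(t):
    return math.cos(k * t)

def dF(t):
    return -k * math.sin(k * t)

lhs = laplace(dF, s)               # transform of the derivative
rhs = s * laplace(F, s) - F(0.0)   # right-hand side of Eq. (15.110)
print(lhs, rhs)
```

Both numbers should agree with the closed form $-k^2/(s^2+k^2)$ to within the quadrature error.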
Strictly speaking, $F(0) = F(+0)$,$^{15}$ and $dF/dt$ is required to be at least
piecewise continuous for $0 \le t < \infty$. Naturally, both $F(t)$ and its derivative
must be such that the integrals do not diverge. Incidentally, Eq. (15.110)
provides another proof of Exercise 15.8.7. An extension gives

$$\mathcal{L}\{F^{(2)}(t)\} = s^2\mathcal{L}\{F(t)\} - sF(+0) - F'(+0), \tag{15.111}$$

$$\mathcal{L}\{F^{(n)}(t)\} = s^n\mathcal{L}\{F(t)\} - s^{n-1}F(+0) - \cdots - F^{(n-1)}(+0). \tag{15.112}$$

The Laplace transform, like the Fourier transform, replaces differentiation
with multiplication. In the following examples, ODEs become algebraic
equations. Here is the power and the utility of the Laplace transform. When the
coefficients of the derivatives are not constant, Laplace transforms do not
simplify the ODE, as a rule.

Note how the initial conditions, $F(+0)$, $F'(+0)$, and so on, are incorporated
into the transform. Equation (15.111) may be used to derive $\mathcal{L}\{\sin kt\}$. We use
the identity

$$-k^2\sin kt = \frac{d^2}{dt^2}\sin kt. \tag{15.113}$$

Then, applying the Laplace transform operation, we have

$$-k^2\mathcal{L}\{\sin kt\} = \mathcal{L}\left\{\frac{d^2}{dt^2}\sin kt\right\} = s^2\mathcal{L}\{\sin kt\} - s\sin(0) - \frac{d}{dt}\sin kt\Big|_{t=0}. \tag{15.114}$$

Since $\sin(0) = 0$ and $d/dt\,\sin kt|_{t=0} = k$,

$$\mathcal{L}\{\sin kt\} = \frac{k}{s^2+k^2}, \tag{15.115}$$

verifying Eq. (15.95).

EXAMPLE 15.9.1

Classical Harmonic Oscillator As a physical example, consider a mass $m$
oscillating under the influence of an ideal spring, spring constant $k$. Friction is
neglected. Then Newton’s second law becomes

$$m\frac{d^2X(t)}{dt^2} + kX(t) = 0. \tag{15.116}$$

The initial conditions are taken to be

$$X(0) = X_0, \quad X'(0) = 0.$$

Applying the Laplace transform, we obtain

$$m\mathcal{L}\left\{\frac{d^2X}{dt^2}\right\} + k\mathcal{L}\{X(t)\} = 0, \tag{15.117}$$

and by use of Eq. (15.111) this becomes

$$ms^2x(s) - msX_0 + kx(s) = 0, \tag{15.118}$$

$$x(s) = X_0\frac{s}{s^2+\omega_0^2}, \quad \text{with } \omega_0^2 = \frac{k}{m}. \tag{15.119}$$

From Eq. (15.95) this is seen to be the transform of $\cos\omega_0 t$, which gives

$$X(t) = X_0\cos\omega_0 t, \tag{15.120}$$

as expected. ■

15 Zero is approached from the positive side.

Dirac Delta Function

For use with differential equations, one further transform is helpful, that of
the Dirac delta function:$^{16}$

$$\mathcal{L}\{\delta(t-t_0)\} = \int_0^\infty e^{-st}\delta(t-t_0)\,dt = e^{-st_0}, \quad \text{for } t_0 > 0, \tag{15.121}$$

and for $t_0 = 0$,

$$\mathcal{L}\{\delta(t)\} = 1, \tag{15.122}$$

where, for Laplace transforms, $\delta(0)$ is interpreted as

$$\delta(0) = \lim_{t_0\to 0+}\delta(t-t_0). \tag{15.123}$$


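As an alternate method, $\delta(t)$ may be considered the limit as $\varepsilon \to 0$ of $F(t)$,
where

$$F(t) = \begin{cases} 0, & t < 0, \\ \varepsilon^{-1}, & 0 < t < \varepsilon, \\ 0, & t > \varepsilon. \end{cases} \tag{15.124}$$

By direct calculation,

$$\mathcal{L}\{F(t)\} = \frac{1 - e^{-\varepsilon s}}{\varepsilon s}. \tag{15.125}$$

Taking the limit of the integral (instead of the integral of the limit), we have

$$\lim_{\varepsilon\to 0}\mathcal{L}\{F(t)\} = 1$$

or Eq. (15.122)

$$\mathcal{L}\{\delta(t)\} = 1.$$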
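The limit in Eq. (15.125) is easy to watch numerically. This is a small illustrative sketch; the function name `pulse_transform` and the sample value of `s` are assumptions made here, not taken from the text.

```python
import math

def pulse_transform(eps, s):
    """Transform (15.125) of a pulse of width eps and height 1/eps."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision for small x
    return -math.expm1(-eps * s) / (eps * s)

s = 3.0  # assumed sample value
widths = [10.0 ** (-p) for p in range(7)]  # 1, 0.1, ..., 1e-6
values = [pulse_transform(eps, s) for eps in widths]
for eps, val in zip(widths, values):
    print(f"eps = {eps:g}  transform = {val:.8f}")
```

As the pulse narrows, the printed values climb toward 1, the transform of $\delta(t)$ in Eq. (15.122).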


This delta function is frequently called the impulse function because it is so 
useful in describing impulsive forces, that is, forces lasting only a short time. 


16 Strictly speaking, the Dirac delta function is undefined. However, the integral over it is well
defined. This approach is developed in Section 1.14 using delta sequences.

Figure 15.10 Spring



EXAMPLE 15.9.2

Impulsive Force Newton’s second law for impulsive force acting on a
particle of mass $m$ becomes

$$m\frac{d^2X}{dt^2} = P\delta(t), \tag{15.126}$$

where $P$ is a constant. Transforming, we obtain

$$ms^2x(s) - msX(0) - mX'(0) = P. \tag{15.127}$$

For a particle starting from rest, $X'(0) = 0$.$^{17}$ We shall also take $X(0) = 0$. Then

$$x(s) = \frac{P}{ms^2} \tag{15.128}$$

and

$$X(t) = \frac{P}{m}t, \tag{15.129}$$

$$\frac{dX(t)}{dt} = \frac{P}{m}, \quad \text{a constant}. \tag{15.130}$$

The effect of the impulse $P\delta(t)$ is to transfer (instantaneously) $P$ units of
linear momentum to the particle. ■


EXERCISES 

15.9.1 Use the expression for the transform of a second derivative to obtain 
the transform of cos kt. 

15.9.2 A mass $m$ is attached to one end of an unstretched spring, spring
constant $k$ (Fig. 15.10). At time $t = 0$ the free end of the spring
experiences a constant acceleration, $a$, away from the mass. Using Laplace
transforms,

(a) find the position $x$ of $m$ as a function of time; and

(b) determine the limiting form of $x(t)$ for small $t$.

ANS. (a) $x = \dfrac{1}{2}at^2 - \dfrac{a}{\omega^2}(1 - \cos\omega t), \quad \omega^2 = \dfrac{k}{m},$

(b) $x = \dfrac{a\omega^2}{4!}t^4, \quad \omega t \ll 1.$

17 This really should be $X'(+0)$. To include the effect of the impulse, consider that the impulse will
occur at $t = \varepsilon$ and let $\varepsilon \to 0$.


15.10 Other Properties 


Substitution 


If we replace the parameter s by s — a in the definition of the Laplace transform 
[Eq. (15.88)], we have 


$$f(s-a) = \int_0^\infty e^{-(s-a)t}F(t)\,dt = \int_0^\infty e^{-st}e^{at}F(t)\,dt = \mathcal{L}\{e^{at}F(t)\}. \tag{15.131}$$

Hence, the replacement of $s$ with $s - a$ corresponds to multiplying $F(t)$ by $e^{at}$
and conversely. This result can be used to good advantage in extending our
table of transforms. From Eq. (15.95) we find immediately that

$$\mathcal{L}\{e^{at}\sin kt\} = \frac{k}{(s-a)^2+k^2}; \tag{15.132}$$

also,

$$\mathcal{L}\{e^{at}\cos kt\} = \frac{s-a}{(s-a)^2+k^2}, \quad s > a. \tag{15.133}$$
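As a quick numerical illustration of the substitution property, the sketch below compares $f(s-a)$, $\mathcal{L}\{e^{at}\sin kt\}$, and the closed form of Eq. (15.132). The helper `laplace` and the parameter values are assumptions chosen for the illustration (with $s > a$ so the integral converges).

```python
import math

def laplace(func, s, t_max=40.0, n=40000):
    """Approximate the integral of exp(-s t) * func(t) over [0, t_max] (Simpson)."""
    h = t_max / n  # n must be even for Simpson's rule
    total = func(0.0) + math.exp(-s * t_max) * func(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-s * t) * func(t)
    return total * h / 3.0

a, k, s = 0.5, 2.0, 2.0  # assumed sample values, s > a

def plain_sine(t):
    return math.sin(k * t)

def growing_sine(t):
    return math.exp(a * t) * math.sin(k * t)

shifted = laplace(plain_sine, s - a)        # f(s - a)
direct = laplace(growing_sine, s)           # transform of e^{at} F(t)
closed_form = k / ((s - a) ** 2 + k ** 2)   # Eq. (15.132)
print(shifted, direct, closed_form)
```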


EXAMPLE 15.10.1 


Damped Oscillator These expressions are useful when we consider an
oscillating mass with damping proportional to the velocity. Equation (15.116),
with such damping added, becomes

$$mX''(t) + bX'(t) + kX(t) = 0, \tag{15.134}$$

where $b$ is a proportionality constant. Let us assume that the particle starts
from rest at $X(0) = X_0$, $X'(0) = 0$. The transformed equation is

$$m[s^2x(s) - sX_0] + b[sx(s) - X_0] + kx(s) = 0 \tag{15.135}$$

so that

$$x(s) = X_0\frac{ms + b}{ms^2 + bs + k}. \tag{15.136}$$

This may be handled by completing the square of the denominator,

$$s^2 + \frac{b}{m}s + \frac{k}{m} = \left(s + \frac{b}{2m}\right)^2 + \left(\frac{k}{m} - \frac{b^2}{4m^2}\right). \tag{15.137}$$

If the damping is small, $b^2 < 4km$, the last term is positive and will be denoted
by $\omega_1^2$. Splitting the numerator of $x(s)$ into $(s + b/2m) + b/2m$ gives

$$x(s) = X_0\frac{s + b/m}{(s+b/2m)^2+\omega_1^2} = X_0\frac{s+b/2m}{(s+b/2m)^2+\omega_1^2} + X_0\frac{(b/2m\omega_1)\,\omega_1}{(s+b/2m)^2+\omega_1^2}. \tag{15.138}$$

By Eqs. (15.132) and (15.133),

$$X(t) = X_0e^{-(b/2m)t}\left(\cos\omega_1t + \frac{b}{2m\omega_1}\sin\omega_1t\right) = X_0\frac{\omega_0}{\omega_1}e^{-(b/2m)t}\cos(\omega_1t - \varphi), \tag{15.139}$$

where

$$\tan\varphi = \frac{b}{2m\omega_1}, \quad \omega_0^2 = \frac{k}{m}. \tag{15.140}$$

Of course, as $b \to 0$, this solution goes over to the undamped solution (Section
15.9). ■
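The underdamped solution can be sanity-checked without any transform machinery. The sketch below, with assumed sample values for $m$, $b$, $k$, and $X_0$, confirms that the two forms in Eq. (15.139) agree and that the result satisfies Eq. (15.134) to finite-difference accuracy; the function names are hypothetical.

```python
import math

m, b, k, X0 = 1.0, 0.4, 4.0, 1.0  # assumed sample values with b^2 < 4km
omega0 = math.sqrt(k / m)
omega1 = math.sqrt(k / m - b * b / (4.0 * m * m))
phi = math.atan(b / (2.0 * m * omega1))

def x_sum(t):
    """First form of Eq. (15.139): cosine plus sine."""
    decay = math.exp(-b * t / (2.0 * m))
    return X0 * decay * (math.cos(omega1 * t)
                         + b / (2.0 * m * omega1) * math.sin(omega1 * t))

def x_phase(t):
    """Second form of Eq. (15.139): single cosine with phase phi."""
    decay = math.exp(-b * t / (2.0 * m))
    return X0 * (omega0 / omega1) * decay * math.cos(omega1 * t - phi)

def ode_residual(x, t, h=1e-3):
    """m x'' + b x' + k x, with derivatives from central differences."""
    d1 = (x(t + h) - x(t - h)) / (2.0 * h)
    d2 = (x(t + h) - 2.0 * x(t) + x(t - h)) / (h * h)
    return m * d2 + b * d1 + k * x(t)

for t in (0.0, 0.5, 1.0, 3.0):
    print(t, x_sum(t), x_phase(t), ode_residual(x_sum, t))
```

The residual is not exactly zero because the derivatives are approximated; it shrinks as the step `h` is reduced (until rounding error takes over).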


RLC Analog

It is worth noting the similarity between this damped simple harmonic
oscillation of a mass on a spring and a resistance, inductance, and capacitance (RLC)
circuit (Fig. 15.11). At any instant the sum of the potential differences around
the loop must be zero (Kirchhoff’s law, conservation of energy). This gives

$$L\frac{dI}{dt} + RI + \frac{1}{C}\int_0^t I\,dt = 0. \tag{15.141}$$

Differentiating the current $I$ with respect to time (to eliminate the integral),
we have

$$L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{1}{C}I = 0. \tag{15.142}$$

Figure 15.11 RLC Circuit

Figure 15.12 Translation

If we replace $I(t)$ with $X(t)$, $L$ with $m$, $R$ with $b$, and $C^{-1}$ with $k$, Eq. (15.142)
is identical to the mechanical problem. It is but one example of the unification
of diverse branches of physics by mathematics. A more complete discussion
is provided by Olson.$^{18}$


Translation 


This time, let $f(s)$ be multiplied by $e^{-bs}$, $b > 0$:

$$e^{-bs}f(s) = e^{-bs}\int_0^\infty e^{-st}F(t)\,dt = \int_0^\infty e^{-s(t+b)}F(t)\,dt. \tag{15.143}$$

Now let $t + b = \tau$; then Eq. (15.143) becomes

$$e^{-bs}f(s) = \int_b^\infty e^{-s\tau}F(\tau-b)\,d\tau = \int_0^\infty e^{-s\tau}F(\tau-b)\,d\tau, \tag{15.144}$$

if we assume that $F(t) = 0$ for $t < 0$, so that $F(\tau - b) = 0$ for $0 \le \tau < b$. In
that case, we can extend the lower limit to zero without changing the value
of the integral. This relation is often called the Heaviside shifting theorem
(Fig. 15.12). We obtain

$$e^{-bs}f(s) = \mathcal{L}\{F(t-b)\}, \quad F(t) = 0, \; t < 0. \tag{15.145}$$
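A numerical check of the shifting theorem, Eq. (15.145), may be sketched as follows. The test function ($\sin t$ switched on at $t = 0$), the delay `b`, and the helper `laplace` are assumptions for this illustration; the grid is chosen so that the kink of the delayed function at $t = b$ falls on a Simpson panel boundary.

```python
import math

def laplace(func, s, t_max=40.0, n=40000):
    """Approximate the integral of exp(-s t) * func(t) over [0, t_max] (Simpson)."""
    h = t_max / n  # n must be even for Simpson's rule
    total = func(0.0) + math.exp(-s * t_max) * func(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-s * t) * func(t)
    return total * h / 3.0

b, s = 1.0, 1.5  # assumed delay and transform variable

def F(t):
    """sin t for t >= 0 and zero before, as Eq. (15.145) requires."""
    return math.sin(t) if t >= 0.0 else 0.0

def F_delayed(t):
    return F(t - b)

lhs = math.exp(-b * s) * laplace(F, s)   # e^{-bs} f(s)
rhs = laplace(F_delayed, s)              # transform of the delayed function
print(lhs, rhs)
```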


EXAMPLE 15.10.2 


Electromagnetic Waves The electromagnetic wave equation with $E = E_y$
or $E_z$, a transverse wave propagating along the $x$-axis, is

$$\frac{\partial^2E(x,t)}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2E(x,t)}{\partial t^2} = 0. \tag{15.146}$$

Transforming this equation with respect to $t$, we get

$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} - \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\} + \frac{s}{v^2}E(x,0) + \frac{1}{v^2}\frac{\partial E(x,t)}{\partial t}\bigg|_{t=0} = 0. \tag{15.147}$$

If we have the initial conditions $E(x, 0) = 0$ and

$$\frac{\partial E(x,t)}{\partial t}\bigg|_{t=0} = 0,$$

then

$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} = \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\}. \tag{15.148}$$

The solution (of this ODE) is

$$\mathcal{L}\{E(x,t)\} = c_1e^{-(s/v)x} + c_2e^{+(s/v)x}. \tag{15.149}$$

The “constants” $c_1$ and $c_2$ are obtained by additional boundary conditions. They
are constant with respect to $x$ but may depend on $s$. If our wave remains finite
as $x \to \infty$, $\mathcal{L}\{E(x,t)\}$ will also remain finite. Hence, $c_2 = 0$.

If $E(0, t)$ is denoted by $F(t)$, then $c_1 = f(s)$ and

$$\mathcal{L}\{E(x,t)\} = e^{-(s/v)x}f(s). \tag{15.150}$$

From the translation property [Eq. (15.144)] we find immediately that

$$E(x,t) = \begin{cases} F\left(t - \dfrac{x}{v}\right), & t \ge \dfrac{x}{v}, \\[1ex] 0, & t < \dfrac{x}{v}, \end{cases} \tag{15.151}$$

which is consistent with our initial condition. Differentiation and substitution
into Eq. (15.146) verifies Eq. (15.151). Our solution represents a wave (or pulse)
moving in the positive $x$-direction with velocity $v$. Note that for $x > vt$ the
region remains undisturbed; the pulse has not had time to get there. The other
independent solution has $c_1 = 0$ and corresponds to a signal propagated along
the negative $x$-axis. ■

18 Olson, H. F. (1943). Dynamical Analogies. Van Nostrand, New York.


Derivative of a Transform 


When $F(t)$, which is at least piecewise continuous, and $s$ are chosen so that
$e^{-st}F(t)$ converges exponentially for large $s$, the integral

$$\int_0^\infty e^{-st}F(t)\,dt$$

is uniformly convergent and may be differentiated (under the integral sign)
with respect to $s$. Then

$$f'(s) = \int_0^\infty (-t)e^{-st}F(t)\,dt = \mathcal{L}\{-tF(t)\}. \tag{15.152}$$

Continuing this process, we obtain

$$f^{(n)}(s) = \mathcal{L}\{(-t)^nF(t)\}. \tag{15.153}$$

All the integrals so obtained will be uniformly convergent because of the
decreasing exponential behavior of $e^{-st}F(t)$.

This same technique may be applied to generate more transforms. For
example,

$$\mathcal{L}\{e^{kt}\} = \int_0^\infty e^{-st}e^{kt}\,dt = \frac{1}{s-k}, \quad s > k. \tag{15.154}$$

Differentiating with respect to $s$ (or with respect to $k$), we obtain

$$\mathcal{L}\{te^{kt}\} = \frac{1}{(s-k)^2}, \quad s > k. \tag{15.155}$$

Integration of Transforms 


Again, with $F(t)$ at least piecewise continuous and $x$ large enough, so that
$e^{-xt}F(t)$ decreases exponentially (as $x \to \infty$), the integral

$$f(x) = \int_0^\infty e^{-xt}F(t)\,dt \tag{15.156}$$

is uniformly convergent with respect to $x$. This justifies reversing the order of
integration in the following equation:

$$\int_s^b f(x)\,dx = \int_s^b\int_0^\infty e^{-xt}F(t)\,dt\,dx = \int_0^\infty\frac{F(t)}{t}\left(e^{-st} - e^{-bt}\right)dt, \tag{15.157}$$

on integrating with respect to $x$. The lower limit $s$ is chosen large enough so
that $f(s)$ is within the region of uniform convergence. Now letting $b \to \infty$, we
have

$$\int_s^\infty f(x)\,dx = \int_0^\infty\frac{F(t)}{t}e^{-st}\,dt = \mathcal{L}\left\{\frac{F(t)}{t}\right\}, \tag{15.158}$$

provided that $F(t)/t$ is finite at $t = 0$ or diverges less strongly than $t^{-1}$ [so that
$\mathcal{L}\{F(t)/t\}$ will exist].
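For $F(t) = \sin t$ the left side of Eq. (15.158) is $\int_s^\infty dx/(x^2+1) = \pi/2 - \tan^{-1}s$, which gives a convenient test case. The sketch below compares this with a direct quadrature of $\mathcal{L}\{\sin t/t\}$; the helper `laplace` and the value of `s` are assumptions made for the illustration.

```python
import math

def laplace(func, s, t_max=40.0, n=40000):
    """Approximate the integral of exp(-s t) * func(t) over [0, t_max] (Simpson)."""
    h = t_max / n  # n must be even for Simpson's rule
    total = func(0.0) + math.exp(-s * t_max) * func(t_max)
    for i in range(1, n):
        t = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-s * t) * func(t)
    return total * h / 3.0

def sine_over_t(t):
    """sin(t)/t with its finite limit 1 at t = 0."""
    return math.sin(t) / t if t != 0.0 else 1.0

s = 1.0  # assumed sample value
transform = laplace(sine_over_t, s)              # L{F(t)/t}
integral_of_f = math.pi / 2.0 - math.atan(s)     # integral of 1/(x^2+1) from s to infinity
print(transform, integral_of_f)
```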


Limits of Integration—Unit Step Function 


The actual limits of integration for the Laplace transform may be specified
with the (Heaviside) unit step function

$$u(t-k) = \begin{cases} 0, & t < k, \\ 1, & t > k. \end{cases}$$

For instance,

$$\mathcal{L}\{u(t-k)\} = \int_k^\infty e^{-st}\,dt = \frac{1}{s}e^{-ks}.$$

A rectangular pulse of width $k$ and unit height is described by $F(t) = u(t) -
u(t-k)$. Taking the Laplace transform, we obtain

$$\mathcal{L}\{u(t) - u(t-k)\} = \int_0^k e^{-st}\,dt = \frac{1}{s}\left(1 - e^{-ks}\right).$$

The unit step function is implicit in Eq. (15.144) and could also be invoked in
Exercise 15.10.13.


EXERCISES 


15.10.1 Solve Eq. (15.134), which describes a damped simple harmonic
oscillator for $X(0) = X_0$, $X'(0) = 0$, and

(a) $b^2 = 4km$ (critically damped),

(b) $b^2 > 4km$ (overdamped).

ANS. (a) $X(t) = X_0e^{-(b/2m)t}\left(1 + \dfrac{b}{2m}t\right).$

15.10.2 Solve Eq. (15.134), which describes a damped simple harmonic
oscillator for $X(0) = 0$, $X'(0) = v_0$, and

(a) $b^2 < 4km$ (underdamped),

(b) $b^2 = 4km$ (critically damped),

(c) $b^2 > 4km$ (overdamped).

ANS. (a) $X(t) = \dfrac{v_0}{\omega_1}e^{-(b/2m)t}\sin\omega_1t,$

(b) $X(t) = v_0te^{-(b/2m)t}.$

15.10.3 The motion of a body falling in a resisting medium may be described
by

$$m\frac{d^2X(t)}{dt^2} = mg - b\frac{dX(t)}{dt}$$

when the retarding force is proportional to the velocity. Find $X(t)$
and $dX(t)/dt$, for the initial conditions

$$X(0) = \frac{dX}{dt}\bigg|_{t=0} = 0.$$

15.10.4 With $J_0(t)$ expressed as a contour integral, apply the Laplace
transform operation, reverse the order of integration, and thus show that

$$\mathcal{L}\{J_0(t)\} = (s^2+1)^{-1/2}, \quad \text{for } s > 0.$$

15.10.5 Develop the Laplace transform of $J_n(t)$ from $\mathcal{L}\{J_0(t)\}$ by using the
Bessel function recurrence relations.

Hint. Here is a chance to use mathematical induction.
15.10.6 A calculation of the magnetic field of a circular current loop in
circular cylindrical coordinates leads to the integral



Show that this integral is equal to $a/(z^2 + a^2)^{3/2}$.


15.10.7 Verify the following Laplace transforms: 



(b) £{y 0 (at)} does not exist. 


15.10.8 Develop a Laplace transform solution of Laguerre’s ODE 

$$tF''(t) + (1-t)F'(t) + nF(t) = 0.$$


Note that you need a derivative of a transform and a transform of 
derivatives. Go as far as you can with n; then (and only then) set 
n = 0. 

15.10.9 Show that the Laplace transform of the Laguerre polynomial L n (at ) 
is given by 



15.10.10 Show that 


$$\mathcal{L}\{E_1(t)\} = \frac{1}{s}\ln(s+1), \quad s > 0,$$

where



$E_1(t)$ is the exponential-integral function.

15.10.11 (a) From Eq. (15.158) show that



provided the integrals exist. 

(b) From the preceding result show that 



in agreement with Eqs. (15.109) and (7.41). 


15.10.12 (a) Show that 
(b) Using this result (with $k = 1$), prove that

$$\mathcal{L}\{\mathrm{si}(t)\} = -\frac{1}{s}\tan^{-1}s,$$

where

$$\mathrm{si}(t) = -\int_t^\infty\frac{\sin x}{x}\,dx, \quad \text{the sine integral}.$$

15.10.13 If $F(t)$ is periodic (Fig. 15.13) with a period $a$ so that $F(t+a) = F(t)$
for all $t \ge 0$, show that

$$\mathcal{L}\{F(t)\} = \frac{\int_0^a e^{-st}F(t)\,dt}{1 - e^{-as}}$$

with the integration now over only the first period of $F(t)$.

Figure 15.13 Periodic Function



15.10.14 Find the Laplace transform of the square wave (period $a$) defined by

$$F(t) = \begin{cases} 1, & 0 < t < a/2, \\ 0, & a/2 < t < a. \end{cases}$$

ANS. $f(s) = \dfrac{1}{s}\cdot\dfrac{1 - e^{-as/2}}{1 - e^{-as}}.$


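15.10.15 Show that

(a) $\mathcal{L}\{\cosh at\cos at\} = \dfrac{s^3}{s^4+4a^4},$

(b) $\mathcal{L}\{\cosh at\sin at\} = \dfrac{as^2+2a^3}{s^4+4a^4},$

(c) $\mathcal{L}\{\sinh at\cos at\} = \dfrac{as^2-2a^3}{s^4+4a^4},$

(d) $\mathcal{L}\{\sinh at\sin at\} = \dfrac{2a^2s}{s^4+4a^4}.$

15.10.16 Show that

(a) $\mathcal{L}^{-1}\{(s^2+a^2)^{-2}\} = \dfrac{1}{2a^3}\sin at - \dfrac{1}{2a^2}t\cos at,$

(b) $\mathcal{L}^{-1}\{s(s^2+a^2)^{-2}\} = \dfrac{1}{2a}t\sin at,$

(c) $\mathcal{L}^{-1}\{s^2(s^2+a^2)^{-2}\} = \dfrac{1}{2a}\sin at + \dfrac{1}{2}t\cos at,$

(d) $\mathcal{L}^{-1}\{s^3(s^2+a^2)^{-2}\} = \cos at - \dfrac{a}{2}t\sin at.$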
Figure 15.14 Change of Variables: (a) $xy$-Plane and (b) $zt$-Plane



15.11 Convolution or Faltungs Theorem 


One of the most important properties of the Laplace transform is that given by
the convolution or Faltungs theorem.$^{19}$ We take two transforms

$$f_1(s) = \mathcal{L}\{F_1(t)\} \quad \text{and} \quad f_2(s) = \mathcal{L}\{F_2(t)\} \tag{15.159}$$

and multiply them together. To avoid complications, when changing variables,
we hold the upper limits finite:

$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-sx}F_1(x)\,dx\int_0^{a-x}e^{-sy}F_2(y)\,dy. \tag{15.160}$$

The upper limits are chosen so that the area of integration, shown in Fig. 15.14a,
is the shaded triangle, not the square. Substituting $x = t - z$, $y = z$, the region
of integration is mapped into the triangle, shown in Fig. 15.14b. If we integrate
over a square in the $xy$-plane, we have a parallelogram in the $tz$-plane, which
simply adds complications. This modification is permissible because the two
integrands are assumed to decrease exponentially. In the limit $a \to \infty$, the
integral over the unshaded triangle will give zero contribution. To verify the
mapping, map the vertices: $t = x + y$, $z = y$. Using Jacobians to transform the
element of area (Chapter 2), we have

$$dx\,dy = \begin{vmatrix}\dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t}\\[2ex] \dfrac{\partial x}{\partial z} & \dfrac{\partial y}{\partial z}\end{vmatrix}dt\,dz = \begin{vmatrix}1 & 0\\ -1 & 1\end{vmatrix}dt\,dz \tag{15.161}$$


19 An alternate derivation employs the Bromwich integral (Section 15.12). This is Exercise 15.12.3. 
or $dx\,dy = dt\,dz$. With this substitution, Eq. (15.160) becomes

$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-st}\int_0^t F_1(t-z)F_2(z)\,dz\,dt = \mathcal{L}\left\{\int_0^t F_1(t-z)F_2(z)\,dz\right\}. \tag{15.162}$$

For convenience this integral is represented by the symbol

$$\int_0^t F_1(t-z)F_2(z)\,dz \equiv F_1 * F_2, \tag{15.163}$$

and referred to as the convolution, closely analogous to the Fourier
convolution (Section 15.6). If we interchange $f_1 \leftrightarrow f_2$ and $F_1 \leftrightarrow F_2$ (or replace
$z \to t - z$), we find

$$F_1 * F_2 = F_2 * F_1, \tag{15.164}$$

showing that the relation is symmetric.

Carrying out the inverse transform, we also find

$$\mathcal{L}^{-1}\{f_1(s)\cdot f_2(s)\} = \int_0^t F_1(t-z)F_2(z)\,dz. \tag{15.165}$$

This can be useful in the development of new transforms or as an alternative
to a partial fraction expansion. One immediate application is in the solution of
integral equations. Since the upper limit $t$ is variable, this Laplace convolution
is useful in treating Volterra integral equations. The Fourier convolution with
fixed (infinite) limits would apply to Fredholm integral equations.

EXAMPLE 15.11.1

Driven Oscillator with Damping As an illustration of the use of the
convolution theorem, let us return to the mass $m$ on a spring, with damping and a
driving force $F(t)$. The equation of motion [Eq. (15.134)] now becomes

$$mX''(t) + bX'(t) + kX(t) = F(t). \tag{15.166}$$

Initial conditions $X(0) = 0$, $X'(0) = 0$ are used to simplify this illustration, and
the transformed equation is

$$ms^2x(s) + bs\,x(s) + kx(s) = f(s) \tag{15.167}$$

or

$$x(s) = \frac{f(s)}{m}\cdot\frac{1}{(s+b/2m)^2+\omega_1^2}, \tag{15.168}$$

where $\omega_1^2 = k/m - b^2/4m^2$, as before.

By the convolution theorem [Eq. (15.160) or Eq. (15.165)], and noting that

$$\frac{\omega_1}{(s+b/2m)^2+\omega_1^2} = \mathcal{L}\{e^{-bt/2m}\sin\omega_1t\},$$

we have

$$X(t) = \frac{1}{m\omega_1}\int_0^t F(t-z)e^{-(b/2m)z}\sin\omega_1z\,dz. \tag{15.169}$$

If the force is impulsive, $F(t) = P\delta(t)$,$^{20}$

$$X(t) = \frac{P}{m\omega_1}e^{-(b/2m)t}\sin\omega_1t. \tag{15.170}$$

$P$ represents the momentum transferred by the impulse, and the constant $P/m$
takes the place of an initial velocity $X'(0)$.

If $F(t) = F_0\sin\omega t$, Eq. (15.168) may be used, but a partial fraction
expansion is perhaps more convenient. With

$$f(s) = \frac{F_0\omega}{s^2+\omega^2},$$

Eq. (15.168) becomes

$$x(s) = \frac{F_0\omega}{m}\cdot\frac{1}{s^2+\omega^2}\cdot\frac{1}{(s+b/2m)^2+\omega_1^2} = \frac{F_0\omega}{m}\left[\frac{a's+b'}{s^2+\omega^2} + \frac{c's+d'}{(s+b/2m)^2+\omega_1^2}\right]. \tag{15.171}$$


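The coefficients $a'$, $b'$, $c'$, and $d'$ are independent of $s$. Direct calculation shows

$$-\frac{1}{a'} = \frac{b}{m}\omega^2 + \frac{m}{b}\left(\omega_0^2-\omega^2\right)^2,$$

$$-\frac{1}{b'} = -\frac{b}{m\left(\omega_0^2-\omega^2\right)}\left[\frac{b}{m}\omega^2 + \frac{m}{b}\left(\omega_0^2-\omega^2\right)^2\right].$$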



Since the $c's + d'$ term leads to exponentially decreasing terms (transients;
as shown above, its denominator belongs to the Laplace transform of
$e^{-bt/2m}\sin\omega_1t$), it will be discarded here. Carrying out the inverse
operation, we find for the steady-state solution

$$X(t) = \frac{F_0}{\left[b^2\omega^2 + m^2\left(\omega_0^2-\omega^2\right)^2\right]^{1/2}}\sin(\omega t - \varphi), \tag{15.172}$$

where

$$\tan\varphi = \frac{b\omega}{m\left(\omega_0^2-\omega^2\right)}.$$

Differentiating the denominator, we find that the amplitude has a maximum
when

$$\omega^2 = \omega_0^2 - \frac{b^2}{2m^2} = \omega_1^2 - \frac{b^2}{4m^2}. \tag{15.173}$$

20 Note that $\delta(t)$ lies inside the interval $[0, t]$.
This is the resonance condition.$^{21}$ At resonance the amplitude becomes $F_0/b\omega_1$,
showing that the mass $m$ goes into infinite oscillation at resonance if
damping is neglected ($b = 0$). It is worth noting that we have had three different
characteristic frequencies: resonance for forced oscillations, with damping,

$$\omega_2^2 = \omega_0^2 - \frac{b^2}{2m^2};$$

free oscillation frequency, with damping,

$$\omega_1^2 = \omega_0^2 - \frac{b^2}{4m^2};$$

and free oscillation frequency, no damping,

$$\omega_0^2 = \frac{k}{m}.$$

They coincide only if the damping is zero ($b = 0$).
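The resonance statements can be checked directly from the amplitude in Eq. (15.172). The following sketch uses assumed sample values for $m$, $b$, $k$, and $F_0$; the function name `amplitude` is hypothetical.

```python
import math

m, b, k, F0 = 1.0, 0.4, 4.0, 1.0  # assumed sample values (light damping)
omega0_sq = k / m
omega1_sq = omega0_sq - b * b / (4.0 * m * m)
omega2_sq = omega0_sq - b * b / (2.0 * m * m)

def amplitude(omega):
    """Steady-state amplitude from Eq. (15.172)."""
    return F0 / math.sqrt(b * b * omega * omega
                          + m * m * (omega0_sq - omega * omega) ** 2)

omega2 = math.sqrt(omega2_sq)
peak = amplitude(omega2)
print("omega_2, omega_1, omega_0:",
      omega2, math.sqrt(omega1_sq), math.sqrt(omega0_sq))
print("peak amplitude:", peak, " F0/(b*omega_1):", F0 / (b * math.sqrt(omega1_sq)))
```

With damping present the three frequencies are ordered $\omega_2 < \omega_1 < \omega_0$, and the amplitude at $\omega_2$ equals $F_0/b\omega_1$.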

Returning to Eqs. (15.166) and (15.167), Eq. (15.166) is our ODE for the
response of a dynamical system to an arbitrary driving force. The final response
clearly depends on both the driving force and the characteristics of our system.
This dual dependence is separated in the transform space. In Eq. (15.168) the
transform of the response (output) appears as the product of two factors, one
describing the driving force (input) and the other describing the dynamical
system. This latter part, which modifies the input and yields the output, is
often called a transfer function. Specifically, $[(s + b/2m)^2 + \omega_1^2]^{-1}$ is the
transfer function corresponding to this damped oscillator. The concept of a
transfer function is of great use in the field of servomechanisms. Often, the
characteristics of a particular servomechanism are described by giving its
transfer function. The convolution theorem then yields the output signal for a
particular input signal. ■
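The convolution theorem, Eq. (15.165), lends itself to a direct numerical test. In the sketch below the pair $F_1(t) = e^{-t}$, $F_2(t) = \sin t$ is an assumed example: the product of their transforms is $1/[(s+1)(s^2+1)]$, whose inverse by partial fractions is $\frac{1}{2}(e^{-t} + \sin t - \cos t)$. The helper `convolve` is hypothetical.

```python
import math

def convolve(F1, F2, t, n=2000):
    """Simpson approximation of the integral of F1(t - z) F2(z) over 0 <= z <= t."""
    if t == 0.0:
        return 0.0
    h = t / n  # n must be even for Simpson's rule
    total = F1(t) * F2(0.0) + F1(0.0) * F2(t)
    for i in range(1, n):
        z = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * F1(t - z) * F2(z)
    return total * h / 3.0

def F1(t):
    return math.exp(-t)   # transform 1/(s + 1)

def F2(t):
    return math.sin(t)    # transform 1/(s^2 + 1)

def closed_form(t):
    """Inverse transform of 1/((s + 1)(s^2 + 1)) by partial fractions."""
    return 0.5 * (math.exp(-t) + math.sin(t) - math.cos(t))

for t in (0.5, 1.0, 2.0):
    print(t, convolve(F1, F2, t), convolve(F2, F1, t), closed_form(t))
```

The second and third columns agree, illustrating the symmetry of Eq. (15.164), and both match the partial fraction result.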


EXERCISES 


15.11.1 From the convolution theorem, show that

$$\frac{1}{s}f(s) = \mathcal{L}\left\{\int_0^t F(x)\,dx\right\},$$

where $f(s) = \mathcal{L}\{F(t)\}$.

15.11.2 Using the convolution integral, calculate

$$\mathcal{L}^{-1}\left\{\frac{1}{(s^2+a^2)(s^2+b^2)}\right\}, \quad a^2 \neq b^2.$$

15.11.3 An undamped oscillator is driven by a force $F_0\sin\omega t$. Find the
displacement as a function of time. Notice that it is a linear combination
of two simple harmonic motions, one with the frequency of the driving
force and one with the frequency $\omega_0$ of the free oscillator. (Assume
$X(0) = X'(0) = 0$.)

ANS. $X(t) = \dfrac{F_0/m}{\omega^2-\omega_0^2}\left(\dfrac{\omega}{\omega_0}\sin\omega_0t - \sin\omega t\right).$

21 The amplitude (squared) has the typical resonance denominator, the Lorentz line shape.


15.12 Inverse Laplace Transform 


Bromwich Integral 


We now develop an expression for the inverse Laplace transform, $\mathcal{L}^{-1}$,
appearing in the equation

$$F(t) = \mathcal{L}^{-1}\{f(s)\}. \tag{15.174}$$

One approach lies in the Fourier transform for which we know the inverse
relation. There is a difficulty, however. Our Fourier transformable function
had to satisfy the Dirichlet conditions. In particular, we required that

$$\lim_{\omega\to\infty}G(\omega) = 0 \tag{15.175}$$

so that the infinite integral would be well defined.$^{22}$ Now we wish to treat
functions, $F(t)$, that may diverge exponentially. To surmount this difficulty,
we extract an exponential factor, $e^{\gamma t}$, from our (possibly) divergent Laplace
function and write

$$F(t) = e^{\gamma t}G(t). \tag{15.176}$$

If $F(t)$ diverges as $e^{\alpha t}$, we require $\gamma$ to be greater than $\alpha$ so that $G(t)$ will be
convergent. Now, with $G(t) = 0$ for $t < 0$ and otherwise suitably restricted
so that it may be represented by a Fourier transform [Eq. (15.20)],

$$G(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}e^{iut}\,du\int_0^\infty G(v)e^{-iuv}\,dv. \tag{15.177}$$

Using Eq. (15.177), we may rewrite Eq. (15.176) as

$$F(t) = \frac{e^{\gamma t}}{2\pi}\int_{-\infty}^{\infty}e^{iut}\,du\int_0^\infty F(v)e^{-\gamma v}e^{-iuv}\,dv. \tag{15.178}$$

Now with the change of variable,

$$s = \gamma + iu, \tag{15.179}$$

the integral over $v$ is cast into the form of a Laplace transform

$$\int_0^\infty F(v)e^{-sv}\,dv = f(s); \tag{15.180}$$

22 If delta functions are included, $G(\omega)$ may be a cosine. Although this does not satisfy Eq. (15.175),
$G(\omega)$ is still bounded.
Figure 15.15 Singularities of $e^{st}f(s)$ [$s$-plane; labels: $\gamma$, possible singularities of $e^{st}f(s)$]

$s$ is now a complex variable and $\Re(s) \ge \gamma$ to guarantee convergence. Notice
that the Laplace transform has mapped a function specified on the positive
real axis onto the complex plane, $\Re(s) \ge \gamma$.$^{23}$

Because $\gamma$ is a constant, $ds = i\,du$. Substituting Eq. (15.180) into Eq.
(15.178), we obtain

$$F(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}f(s)\,ds. \tag{15.181}$$

Here is our inverse transform. We have rotated the line of integration through
90° (by using $ds = i\,du$). The path has become an infinite vertical line in the
complex plane, the constant $\gamma$ having been chosen so that all the singularities
of $f(s)$ are to the left of this line (Fig. 15.15).

Equation (15.181), our inverse transformation, is usually known as the
Bromwich integral, although sometimes it is referred to as the Fourier-Mellin
theorem or Fourier-Mellin integral. This integral may now be evaluated by
the regular methods of contour integration (Chapter 7), if there are no branch
cuts, that is, if $f$ is a single-valued analytic function. If $t > 0$, the contour may
be closed by an infinite semicircle in the left half-plane, provided the integral
over this semicircle is negligible. Then, by the residue theorem (Section 7.2),

$$F(t) = \sum\,(\text{residues included for } \Re(s) < \gamma). \tag{15.182}$$

Possibly, this means of evaluation, with $\Re(s)$ ranging through negative
values, seems paradoxical in view of our previous requirement that $\Re(s) \ge \gamma$.
The paradox disappears when we recall that the requirement $\Re(s) \ge \gamma$ was
imposed to guarantee convergence of the Laplace transform integral that
defined $f(s)$. Once $f(s)$ is obtained, we may proceed to exploit its properties as

23 For a derivation of the inverse Laplace transform using only real variables, see C. L. Bohn and 
R. W. Flynn, Real variable inversion of Laplace transforms: An application in plasma physics. Am. 
J. Phys. 46, 1250 (1978). 
an analytical function in the complex plane wherever we choose.$^{24}$ In effect,
we are employing analytic continuation to get $\mathcal{L}\{F(t)\}$ in the left half-plane
exactly as the recurrence relation for the factorial function was used to extend
the Euler integral definition [Eq. (10.5)] to the left half-plane.

Perhaps two examples may clarify the evaluation of Eq. (15.182).


EXAMPLE 15.12.1 


Inversion via Calculus of Residues If $f(s) = a/(s^2 - a^2)$, then

$$e^{st}f(s) = \frac{ae^{st}}{(s+a)(s-a)}. \tag{15.183}$$

The residues may be found by using Exercise 7.1.1 or various other means. The
first step is to identify the singularities, the poles. Here, we have one simple
pole at $s = a$ and another simple pole at $s = -a$. By Exercise 7.1.1, the residue
at $s = a$ is $(\frac{1}{2})e^{at}$ and the residue at $s = -a$ is $(-\frac{1}{2})e^{-at}$. Then

$$\sum\text{Residues} = \tfrac{1}{2}\left(e^{at} - e^{-at}\right) = \sinh at = F(t), \tag{15.184}$$

in agreement with Eq. (15.94). ■
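The residue sum of Eq. (15.184) can also be obtained numerically by integrating $e^{st}f(s)$ around a small circle about each pole. The sketch below is only an illustration: the helper `residue`, the circle radius, and the sample values of `a` and `t` are assumptions, and the radius must be small enough that each circle encloses a single pole.

```python
import cmath
import math

def residue(g, pole, radius=0.5, n=256):
    """Approximate (1 / 2 pi i) times the contour integral of g around `pole`.

    On the circle s = pole + z with z = radius * exp(i theta), ds = i z dtheta,
    so the integral reduces to the average of g(pole + z) * z over theta.
    """
    total = 0j
    for j in range(n):
        z = radius * cmath.exp(2j * math.pi * j / n)
        total += g(pole + z) * z
    return total / n

a, t = 1.5, 0.8  # assumed sample values

def integrand(s):
    return a * cmath.exp(s * t) / ((s + a) * (s - a))  # Eq. (15.183)

F_t = residue(integrand, a) + residue(integrand, -a)
print(F_t.real, math.sinh(a * t))
```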


EXAMPLE 15.12.2 


Another Inversion If

$$f(s) = \frac{1 - e^{-as}}{s}, \quad a > 0,$$

then $e^{s(t-a)}$ grows exponentially for $t < a$ on the semicircle in the left-hand
$s$-plane so that contour integration and the residue theorem are not applicable.
However, we can evaluate the integral explicitly as follows. We let $\gamma \to 0$ and
substitute $s = iy$ so that

$$F(t) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}e^{st}f(s)\,ds = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\left[e^{iyt} - e^{iy(t-a)}\right]\frac{dy}{y}. \tag{15.185}$$

Using the Euler identity, only the sines survive, because the cosine terms
divided by $y$ are odd in $y$, and we obtain

$$F(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left[\frac{\sin ty}{y} - \frac{\sin(t-a)y}{y}\right]dy. \tag{15.186}$$

If $k > 0$, $\int_0^\infty\frac{\sin ky}{y}\,dy$ gives $\pi/2$ and $-\pi/2$ if $k < 0$. As a consequence, $F(t) = 0$
if $t > a > 0$ and if $t < 0$. If $0 < t < a$, then $F(t) = 1$. This can be written
compactly in terms of the Heaviside unit step function $u(t)$ as follows:

$$F(t) = u(t) - u(t-a) = \begin{cases} 0, & t < 0, \\ 1, & 0 < t < a, \\ 0, & t > a. \end{cases} \tag{15.187}$$

Thus, $F(t)$ is a step function of unit height and length $a$ (Fig. 15.16). ■


24 In numerical work, /(s) may well be available only for discrete real, positive values of s. Then 
numerical procedures are indicated. See Section 15.8 and the Krylov and Skoblya reference in 
Additional Reading. 

















Figure 15.16 Finite-Length Step Function $u(t) - u(t-a)$

Two general comments are in order. First, these two examples hardly begin
to show the usefulness and power of the Bromwich integral. It is always
available for inverting a complicated transform, when the tables prove inadequate.

Second, this derivation is not presented as a rigorous one. Rather, it is 
given more as a plausibility argument, although it can be made rigorous. The 
determination of the inverse transform is similar to the solution of a differential 
equation. It makes little difference how you get the solution. Guess at it if 
you want. The solution can always be checked by substitution back into the 
original differential equation. Similarly, F(t) can (and, to check for careless 
errors, should) be checked by determining whether by Eq. (15.88) 


$$\mathcal{L}\{F(t)\} = f(s).$$


Two alternate derivations of the Bromwich integral are the subjects of 
Exercises 15.12.1 and 15.12.2. 

As a final illustration of the use of the Laplace inverse transform, we discuss
some results from the work of Brillouin and Sommerfeld (1914) in
electromagnetic theory.

Velocity of Electromagnetic Waves in a Dispersive Medium The group
velocity $u$ of traveling waves is related to the phase velocity $v$ by the equation

$$u = v - \lambda\frac{dv}{d\lambda}, \tag{15.188}$$

where $\lambda$ is the wavelength. In the vicinity of an absorption line (resonance)
$dv/d\lambda$ may be sufficiently negative so that $u > c$ (Fig. 15.17). The question
immediately arises whether a signal can be transmitted faster than $c$, the velocity
of light in vacuum. This question, which assumes that such a group velocity
is meaningful, is of fundamental importance to the theory of special relativity.
We need a solution to the wave equation

$$\frac{\partial^2\psi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2}, \tag{15.189}$$
Figure 15.17 Optical Dispersion [vertical axis: $n(\omega)$, with the level 1 marked; anomalous region indicated; horizontal axis: increasing frequency, with increasing wavelength $\lambda$ in the opposite direction]


corresponding to a harmonic vibration starting at the origin at time zero. Our
medium is dispersive, meaning that $v$ is a function of the angular frequency.
Imagine, for instance, a plane wave, angular frequency $\omega$, incident on a shutter
at the origin. At $t = 0$ the shutter is (instantaneously) opened, and the wave is
permitted to advance along the positive $x$-axis.

Let us then build up a solution starting at $x = 0$. It is convenient to use the
Cauchy integral formula [Eq. (6.37)],

$$\psi(0,t) = \frac{1}{2\pi i}\oint\frac{e^{-izt}}{z-z_0}\,dz = e^{-iz_0t}$$

(for a contour encircling $z = z_0$ in the positive sense). Using $s = -iz$ and
$z_0 = \omega$, we obtain

$$\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{e^{st}}{s+i\omega}\,ds = \begin{cases} 0, & t < 0, \\ e^{-i\omega t}, & t > 0. \end{cases} \tag{15.190}$$


To be complete, the loop integral is along the vertical line $\Re(s) = \gamma$ and an
infinite semicircle as shown in Fig. 15.18. The location of the infinite
semicircle is chosen so that the integral over it vanishes. This means a semicircle
in the left half-plane for $t > 0$ and the residue is enclosed. For $t < 0$ we choose
the right half-plane and no singularity is enclosed. The fact that this is the
Bromwich integral may be verified by noting that



(15.191) 


and applying the Laplace transform. The transformed function /(s) becomes 
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Figure 15.18 
Possible Closed Contours 



Our Cauchy-Bromwich integral provides us with the time dependence of 
a signal leaving the origin at t = 0. To include the space dependence, we note 
that 


e s(t-x/v ) 

satisfies the wave equation. With this as a clue, we replace t by t — x/v and 
write a solution 


f(x, t ) = 


1 

2 ni 


py+io o 
J y—ioo 


ps(t x/v') 

- - ^ds. 

s + ico 


(15.193) 


It was seen in the derivation of the Bromwich integral that our variable 
s replaces the co of the Fourier transformation. Hence, the wave velocity v 
becomes a function of s, that is, v(s'). Its particular form need not concern us 
here. We need only the property 


lim v (s) = c, constant. (15.194) 

|s |—>oo 


This is suggested by the asymptotic behavior of the curve on the right side of 
Fig. 15.17. 25 

Evaluating Eq. (15.193) by the calculus of residues, we may close the path
of integration by a semicircle in the right half-plane, provided

$$t - \frac{x}{c} < 0.$$

Hence,

$$\psi(x, t) = 0, \qquad t - \frac{x}{c} < 0, \tag{15.195}$$


25 Equation (15.193) follows rigorously from the theory of anomalous dispersion and the Kronig-Kramers
optical dispersion relations.

Table 15.1 Laplace Transform Operations

| Operation | Formula | Equation No. |
|---|---|---|
| Laplace transform | $f(s) = \mathcal{L}\{F(t)\} = \int_0^\infty e^{-st}F(t)\,dt$ | (15.88) |
| Transform of derivative | $sf(s) - F(+0) = \mathcal{L}\{F'(t)\}$ | (15.110) |
| | $s^2 f(s) - sF(+0) - F'(+0) = \mathcal{L}\{F''(t)\}$ | (15.111) |
| Transform of integral | $\frac{1}{s}f(s) = \mathcal{L}\left\{\int_0^t F(x)\,dx\right\}$ | Exercise 15.11.1 |
| Substitution | $f(s - a) = \mathcal{L}\{e^{at}F(t)\}$ | (15.131) |
| Translation | $e^{-bs}f(s) = \mathcal{L}\{F(t - b)u(t - b)\}$ | (15.145) |
| Derivative of transform | $f^{(n)}(s) = \mathcal{L}\{(-t)^n F(t)\}$ | (15.153) |
| Integral of transform | $\int_s^\infty f(x)\,dx = \mathcal{L}\left\{\frac{F(t)}{t}\right\}$ | (15.158) |
| Convolution | $f_1(s)f_2(s) = \mathcal{L}\left\{\int_0^t F_1(t - z)F_2(z)\,dz\right\}$ | (15.162) |
| Inverse transform, Bromwich integral | $\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{st}f(s)\,ds = F(t)$ | (15.181) |


which means that the velocity of our signal cannot exceed the velocity of light
in vacuum c. This simple but very significant result was extended by Sommerfeld
and Brillouin to show just how the wave advanced in the dispersive
medium. ■


Summary: Inversion of Laplace Transform 


• Direct use of tables and references; use of partial fractions (Section 15.8) 
and the operational theorems of Table 15.1. 

• Bromwich integral [Eq. (15.181)] and the calculus of residues.

• Numerical inversion (Section 15.8) and references. 
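
The third option in the list above can be illustrated with a short sketch. It uses the Gaver-Stehfest algorithm, which is one widely used numerical inversion scheme and is an assumption here: the text does not say which method Section 15.8 uses. The function names `stehfest_coefficients` and `invert_laplace` and the default of 12 terms are illustrative choices. The scheme samples f(s) only at real s, so it suits smooth, nonoscillatory F(t).

```python
from math import exp, factorial, log


def stehfest_coefficients(n):
    """Return the n Gaver-Stehfest weights V_1..V_n (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    half = n // 2
    weights = []
    for k in range(1, n + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += (j ** half * factorial(2 * j)) / (
                factorial(half - j) * factorial(j) * factorial(j - 1)
                * factorial(k - j) * factorial(2 * j - k)
            )
        weights.append((-1) ** (k + half) * total)
    return weights


def invert_laplace(f, t, n=12):
    """Approximate F(t) from its transform f(s), sampled at real s = k ln2 / t."""
    ln2_over_t = log(2.0) / t
    weights = stehfest_coefficients(n)
    return ln2_over_t * sum(
        v * f(k * ln2_over_t) for k, v in enumerate(weights, start=1)
    )


# Example: f(s) = 1/(s + 1) is the transform of F(t) = exp(-t).
approx = invert_laplace(lambda s: 1.0 / (s + 1.0), 1.0)
error = abs(approx - exp(-1.0))
```

The weights alternate in sign and grow quickly with n, so in double precision larger n does not keep improving the result; values near 12 to 16 are a common compromise.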


EXERCISES 


15.12.1 Derive the Bromwich integral from Cauchy’s integral formula.
Hint. Apply the inverse transform $\mathcal{L}^{-1}$ to

$$f(s) = \frac{1}{2\pi i}\lim_{\alpha\to\infty}\int_{\gamma-i\alpha}^{\gamma+i\alpha}\frac{f(z)}{s - z}\,dz,$$

where f(z) is analytic for $\Re(z) \ge \gamma$.

15.12.2 Starting with

$$\frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty} e^{st}f(s)\,ds,$$

show that by introducing

$$f(s) = \int_0^\infty e^{-sz}F(z)\,dz,$$

we can convert one integral into the Fourier representation of a Dirac
delta function. From this, derive the inverse Laplace transform.

15.12.3 Derive the Laplace transformation convolution theorem by use of 
the Bromwich integral. 

15.12.4 Find

$$\mathcal{L}^{-1}\left\{\frac{s}{s^2 - k^2}\right\}$$

(a) by a partial fraction expansion,

(b) repeat using the Bromwich integral.

15.12.5 Find

$$\mathcal{L}^{-1}\left\{\frac{k^2}{s(s^2 + k^2)}\right\}$$

(a) by using a partial fraction expansion,

(b) repeat using the convolution theorem,

(c) repeat using the Bromwich integral.

ANS. F(t) = 1 − cos kt.

15.12.6 Use the Bromwich integral to find the function whose transform is
$f(s) = s^{-1/2}$. Note that f(s) has a branch point at s = 0. The negative
x-axis may be taken as a cut line.

ANS. $F(t) = (\pi t)^{-1/2}$.


15.12.7 Evaluate the inverse Laplace transform

$$\mathcal{L}^{-1}\left\{(s^2 - a^2)^{-1/2}\right\}$$

by each of the following methods:

(a) Expansion in a series and term-by-term inversion.

(b) Direct evaluation of the Bromwich integral.

(c) Change of variable in the Bromwich integral: $s = (a/2)(z + z^{-1})$.

15.12.8 Show that 



where γ = 0.5772..., the Euler-Mascheroni constant.


15.12.9 Evaluate the Bromwich integral for

$$f(s) = \frac{s}{(s^2 + a^2)^2}.$$

15.12.10 Heaviside expansion theorem. If the transform f(s) may be written
as a ratio

$$f(s) = \frac{g(s)}{h(s)},$$

where g(s) and h(s) are analytic functions, with h(s) having simple,
isolated zeros at $s = s_i$, show that

$$F(t) = \mathcal{L}^{-1}\left\{\frac{g(s)}{h(s)}\right\} = \sum_i \frac{g(s_i)}{h'(s_i)}\,e^{s_i t}.$$


Hint. See Exercise 7.1.2. 


15.12.11 Using the Bromwich integral, invert $f(s) = s^{-2}e^{-ks}$. Express
$F(t) = \mathcal{L}^{-1}\{f(s)\}$ in terms of the (shifted) unit step function u(t − k).

ANS. F(t) = (t − k)u(t − k).


15.12.12 You have the following Laplace transform:

$$f(s) = \frac{1}{(s + a)(s + b)}, \qquad a \ne b.$$

Invert this transform by each of three methods:

(a) partial fractions and use of tables;

(b) convolution theorem; and

(c) Bromwich integral.

ANS. $F(t) = \dfrac{e^{-bt} - e^{-at}}{a - b}, \qquad a \ne b.$


Additional Reading 


Champeney, D. C. (1973). Fourier Transforms and Their Physical Applications.
Academic Press, New York. Fourier transforms are developed in a
careful, easy to follow manner. Approximately 60% of the book is devoted
to applications of interest in physics and engineering.

Erdelyi, A., Magnus, W., Oberhettinger, F., and Tricomi, F. G. (1954). Tables of
Integral Transforms. McGraw-Hill, New York. This text contains extensive
tables of Fourier sine, cosine, and exponential transforms, Laplace and
inverse Laplace transforms, Mellin and inverse Mellin transforms, Hankel
transforms, and other more specialized integral transforms.

Hanna, J. R. (1990). Fourier Series and Integrals of Boundary Value Problems.
Wiley, Somerset, NJ. This book is a broad treatment of the Fourier
solution of boundary value problems. The concepts of convergence and
completeness are given careful attention.

Jeffreys, H., and Jeffreys, B. S. (1972). Methods of Mathematical Physics, 3rd
ed. Cambridge Univ. Press, Cambridge, UK.

Krylov, V. I., and Skoblya, N. S. (1969). Handbook of Numerical Inversion of
Laplace Transform. Israel Program for Scientific Translations, Jerusalem.

Lepage, W. R. (1961). Complex Variables and the Laplace Transform for
Engineers. McGraw-Hill, New York. Reprinted, Dover, New York (1980). A
complex variable analysis which is carefully developed and then applied
to Fourier and Laplace transforms. It is written to be read by students, but
it is intended for the serious student.

McCollum, P. A., and Brown, B. F. (1965). Laplace Transform Tables and
Theorems. Holt, Rinehart and Winston, New York.

Miles, J. W. (1971). Integral Transforms in Applied Mathematics. Cambridge
Univ. Press, Cambridge, UK. This is a brief but interesting and useful treatment
for the advanced undergraduate. It emphasizes applications rather
than abstract mathematical theory.

Papoulis, A. (1962). The Fourier Integral and Its Applications. McGraw-Hill,
New York. This is a rigorous development of Fourier and Laplace transforms
and has extensive applications in science and engineering.

Roberts, G. E., and Kaufman, H. (1966). Table of Laplace Transforms. Saunders,
Philadelphia.

Sneddon, I. N. (1951). Fourier Transforms. McGraw-Hill, New York. Reprinted,
Dover, New York (1995). A detailed comprehensive treatment, this book is
loaded with applications to a wide variety of fields of modern and classical
physics.

Sneddon, I. N. (1972). The Use of Integral Transforms. McGraw-Hill, New
York. Written for students in science and engineering in terms they can
understand, this book covers all the integral transforms mentioned in this
chapter as well as in several others. Many applications are included.

Van der Pol, B., and Bremmer, H. (1987). Operational Calculus Based on the
Two-Sided Laplace Integral, 3rd ed. Cambridge Univ. Press, Cambridge,
UK. Here is a development based on the integral range −∞ to +∞, rather
than the useful 0 to ∞. Chapter 5 contains a detailed study of the Dirac
delta function (impulse function).

Wolf, K. B. (1979). Integral Transforms in Science and Engineering. Plenum,
New York. This book is a very comprehensive treatment of integral transforms
and their applications.

Titchmarsh, E. C. (1937). Introduction to the Theory of Fourier Integrals, 2nd
ed. Oxford Univ. Press, New York.




Partial Differential 
Equations 



Let us continue our discussion of partial differential equations (PDEs) by mentioning
examples of PDEs in physics. Among the most frequently encountered
PDEs are the following:


1. Laplace’s equation, $\nabla^2\psi(\mathbf{r}) = 0$.

This very common and very important equation occurs in studies of

a. electromagnetic phenomena, including electrostatics in regions containing
no electric charges, dielectrics, steady currents, and magnetostatics;

b. hydrodynamics (irrotational flow of perfect fluids and surface waves);

c. heat flow and diffusion; and

d. gravitation in regions containing no masses.

2. Poisson’s equation, $\nabla^2\psi(\mathbf{r}) = -\rho/\varepsilon_0$.

In contrast to the homogeneous Laplace equation, Poisson’s equation is
nonhomogeneous, with a source term $-\rho(\mathbf{r})/\varepsilon_0$ in electrostatics and (with
$\varepsilon_0 \to 1$) in Newtonian gravity.

3. The wave (Helmholtz) and time-independent diffusion equations,

$$\nabla^2\psi(\mathbf{r}) \pm k^2\psi = 0.$$

These equations also appear in such diverse phenomena as 

a. elastic waves in solids, including vibrating strings, bars, and membranes;

b. sound or acoustics;

c. electromagnetic waves; and

d. nuclear reactors.

4. The diffusion or heat flow equation

$$\nabla^2\psi(t, \mathbf{r}) = \frac{1}{a^2}\frac{\partial\psi}{\partial t}.$$

5. The corresponding four-dimensional form in the time-dependent wave
equation of electrodynamics involving the d’Alembertian, a four-dimensional
analog of the Laplacian in Minkowski space (Chapters 2 and 4),

$$\partial^\mu\partial_\mu\varphi(x) = \partial^2\varphi = \left[\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2\right]\varphi = 0.$$

6. The time-dependent scalar potential equation, $\partial^2\psi(x) = \rho/\varepsilon_0$.

Like Poisson’s equation, this equation occurs in electrodynamics and is
nonhomogeneous with a source term $\rho/\varepsilon_0$.

7. The Klein-Gordon equation, $\partial^2\psi(x) = -\mu^2\psi(x)$ for mass μ, and the
corresponding vector equations in which the scalar function ψ is replaced by
a vector field. Other more complicated forms are common.

8. The Schrödinger wave equation,

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\frac{\partial\psi}{\partial t}$$

and, for the time-independent case,

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi.$$

9. The equations for elastic waves and viscous fluids and the telegraphy equation
containing the operators $\nabla^2$, $\partial^2/\partial t^2$, and $\partial/\partial t$.

10. Maxwell’s coupled partial differential equations for electric and magnetic 
fields and Dirac’s equation for relativistic electron wave functions. For 
Maxwell’s equations, see Section 1.8. 


Some general techniques for solving second-order PDEs were discussed in 
Chapter 8 and are further discussed in this chapter: 


1. Separation of variables (Section 8.9), where the PDE is split into ordinary
differential equations (ODEs) that are related by common constants that
appear as eigenvalues of linear operators, $\mathcal{L}\psi = l\psi$, usually in one variable.
This method is closely related to symmetries of the PDE and a
group of transformations (see Section 4.2). The Helmholtz equation,
given as example 3 above, has this form, where the constant $k^2$ may arise
by separation of the time t from the spatial variables. Likewise, in example 8,
the energy E is an eigenvalue that arises in the separation of t from r
in the Schrödinger equation. The resulting separated ODEs are discussed
in Chapter 9 in greater detail.

2. Conversion of a PDE into an integral equation using Green’s functions applies
to inhomogeneous PDEs, such as examples 2 and 7 given previously.
An introduction to the Green’s function technique is given in Section 16.3.

3. Other analytical methods, such as the use of integral transforms, are developed
and applied in Chapter 15.

4. Numerical calculations: The development of computers has opened up a
wealth of possibilities based on the calculus of finite differences. 1

Occasionally, we encounter equations of higher order. In both the theory
of the slow motion of a viscous fluid and the theory of an elastic body, we find
the equation

$$(\nabla^2)^2\psi = 0.$$

These higher order differential equations are rare and so are not discussed
here.
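
The finite-difference idea in item 4 can be illustrated with a minimal sketch. It assumes the one-dimensional diffusion equation ∂ψ/∂t = a² ∂²ψ/∂x² treated in Section 16.2 and uses the explicit forward-time, centered-space update. The scheme, the function name `ftcs_heat`, the grid, and the sine-shaped initial profile are illustrative assumptions, not taken from the text. The known decaying solution exp(−π²a²t) sin(πx) serves as a check.

```python
from math import exp, pi, sin


def ftcs_heat(psi0, a, dx, dt, steps):
    """Advance d(psi)/dt = a^2 d^2(psi)/dx^2 with the two end values held fixed."""
    r = a * a * dt / (dx * dx)
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for a^2 dt / dx^2 > 1/2")
    psi = list(psi0)
    for _ in range(steps):
        interior = [
            psi[i] + r * (psi[i + 1] - 2.0 * psi[i] + psi[i - 1])
            for i in range(1, len(psi) - 1)
        ]
        psi = [psi[0]] + interior + [psi[-1]]
    return psi


# Hypothetical test problem: psi(x, 0) = sin(pi x) on 0 <= x <= 1, ends held at 0.
a = 1.0
nx = 51
dx = 1.0 / (nx - 1)
dt = 0.4 * dx * dx / (a * a)
steps = 625
grid = [i * dx for i in range(nx)]
numeric = ftcs_heat([sin(pi * x) for x in grid], a, dx, dt, steps)
t_final = steps * dt
exact = [exp(-(pi * a) ** 2 * t_final) * sin(pi * x) for x in grid]
max_error = max(abs(u - v) for u, v in zip(numeric, exact))
```

The time step is tied to the grid spacing through the stability bound a² Δt/Δx² ≤ 1/2, which is why refining the grid by a factor of 2 costs a factor of 4 in the number of steps.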

Although not frequently encountered, and perhaps not as important as 
second-order ODEs, first-order ODEs do appear in theoretical physics and are 
sometimes intermediate steps for second-order ODEs. The solutions of some 
more important types of first-order ODEs are developed in Section 8.2. First- 
order PDEs can always be reduced to ODEs. This is a straightforward but 
lengthy process. 

Boundary Conditions 

Usually, when we know the state of a physical system at some time and the law 
governing its evolution, then we are able to predict its outcome. Such initial 
conditions are the most common boundary conditions associated with ODEs 
and PDEs. Problems in which one is finding solutions that match given points, 
curves, or surfaces are referred to as boundary value problems. Eigenfunctions 
are usually required to satisfy most boundary conditions. These boundary 
conditions may take three forms: 

• Cauchy boundary conditions: The value of a function and its normal derivative
are specified on the boundary. In electrostatics, this would mean φ, the
potential, and $E_n$, the normal component of the electric field.

• Dirichlet boundary conditions: The value of a function is specified on the
boundary.

• Neumann boundary conditions: The normal derivative (normal gradient) of
a function is specified on the boundary. In the electrostatic case, this would
be $E_n$, and therefore σ, the surface charge density.

A summary of the relation of these three types of boundary conditions to 
the three types of two-dimensional PDEs is given in Table 16.1. For extended 
discussions of these PDEs, consult Sommerfeld (Chapter 2) or Morse and 
Feshbach (Chapter 6) and R. Courant and D. Hilbert, Methods of Mathematical 
Physics (Interscience, New York, 1989) (see also Additional Reading). 

Parts of Table 16.1 are simply a matter of maintaining internal consistency 
or common sense. For instance, for Poisson’s equation with a closed surface, 


1 For further details of numerical computation, see R. W. Hamming’s Numerical Methods for
Scientists and Engineers (McGraw-Hill, New York, 1973) and proceed to specialized references.

Table 16.1

| Boundary condition | Elliptic: Laplace, Poisson in (x, y) | Hyperbolic: wave equation in (x, t) | Parabolic: diffusion equation in (x, t) |
|---|---|---|---|
| Cauchy, open surface | Unphysical results (instability) | Unique, stable solution | Too restrictive |
| Cauchy, closed surface | Too restrictive | Too restrictive | Too restrictive |
| Dirichlet, open surface | Insufficient | Insufficient | Unique, stable solution |
| Dirichlet, closed surface | Unique, stable solution | Solution not unique | Too restrictive |
| Neumann, open surface | Insufficient | Insufficient | Unique, stable solution |
| Neumann, closed surface | Unique, stable solution | Solution not unique | Too restrictive |


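Table 16.2

Solutions in Spherical Polar Coordinates (a)

$$\psi = \sum_{l,m} a_{lm}\psi_{lm}$$

| PDE | Radial factor | Polar factor | Azimuthal factor (b) |
|---|---|---|---|
| $\nabla^2\psi = 0$ | $r^l$, $r^{-l-1}$ | $P_l^m(\cos\theta)$, $Q_l^m(\cos\theta)$ | $\cos m\varphi$, $\sin m\varphi$ |
| $\nabla^2\psi + k^2\psi = 0$ | $j_l(kr)$, $y_l(kr)$ | $P_l^m(\cos\theta)$, $Q_l^m(\cos\theta)$ | $\cos m\varphi$, $\sin m\varphi$ |
| $\nabla^2\psi - k^2\psi = 0$ | $i_l(kr)$, $k_l(kr)$ | $P_l^m(\cos\theta)$, $Q_l^m(\cos\theta)$ | $\cos m\varphi$, $\sin m\varphi$ |

(a) References for some of the functions are $P_l^m(\cos\theta)$, m = 0, Section 11.1;
m ≠ 0, Section 11.5; $Q_l^m(\cos\theta)$, irregular Legendre solution;
$j_l(kr)$, $y_l(kr)$, $i_l(x) = i^{-l}j_l(ix)$, and $k_l(x) = -i^l h_l^{(1)}(ix)$.
(b) $\cos m\varphi$ and $\sin m\varphi$ may be replaced by $e^{\pm im\varphi}$.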


Dirichlet conditions lead to a unique, stable solution. Neumann conditions 
likewise lead to a unique, stable solution independent of the Dirichlet solution. 
Therefore, Cauchy boundary conditions (meaning Dirichlet plus Neumann) 
could lead to an inconsistency. 

The term boundary conditions includes the concept of initial conditions.
For instance, specifying the initial position $x_0$ and the initial velocity $v_0$ in some
dynamical problem would correspond to the Cauchy boundary conditions.
The only difference in the present usage of boundary conditions in these
one-dimensional problems is that we are going to apply the conditions on both
ends of the allowed range of the variable.

For convenient reference, the forms of the solutions of Laplace’s equation,
Helmholtz’s equation, and the diffusion equation for spherical polar coordinates
are shown in Table 16.2. The solutions of Laplace’s equation in circular
cylindrical coordinates are presented in Table 16.3.

(a) References for the radial functions are $J_m(\alpha\rho)$, Section 12.1;
$Y_m(\alpha\rho)$, Section 12.2; $I_m(\alpha\rho)$ = and $K_m(\alpha\rho)$ =


For the Helmholtz and the diffusion equation (see Section 16.2) the constant
$\pm k^2$ is added to the separation constant $\pm\alpha^2$ to define a new parameter $\gamma^2$ or
$-\gamma^2$. For the choice $+\gamma^2$ (with $\gamma^2 > 0$), we get Bessel functions $J_m(\gamma\rho)$ and
$Y_m(\gamma\rho)$. For the choice $-\gamma^2$ (with $\gamma^2 > 0$), we get modified Bessel functions
$I_m(\gamma\rho)$ and $K_m(\gamma\rho)$.

These ODEs and two generalizations of them will be examined and system¬ 
atized in the following sections. General properties following from the form of 
the differential equations are discussed in Chapter 9. The individual solutions 
are developed and applied in Chapters 11-13. 

The practicing physicist will probably meet other second-order ODEs, 
some of which may possibly be transformed into the examples studied here. 
Some of these ODEs may be solved by the techniques of Sections 8.5 and 8.6. 
Others may require a computer for a numerical solution. 


16.2 Heat Flow or Diffusion PDE 


Here, we address the full time-dependent diffusion PDE for an isotropic
medium. Assuming isotropy is actually not much of a restriction because,
in case we have different (constant) rates of diffusion in different directions
(e.g., in wood), our heat flow PDE takes the form

$$\frac{\partial\psi}{\partial t} = a^2\frac{\partial^2\psi}{\partial x^2} + b^2\frac{\partial^2\psi}{\partial y^2} + c^2\frac{\partial^2\psi}{\partial z^2} \tag{16.1}$$


if we put the coordinate axes along the principal directions of anisotropy. Now
we simply rescale the coordinates using the substitutions x = aξ, y = bη,
z = cζ to get back the original isotropic form of Eq. (16.1),

$$\frac{\partial\Phi}{\partial t} = \frac{\partial^2\Phi}{\partial\xi^2} + \frac{\partial^2\Phi}{\partial\eta^2} + \frac{\partial^2\Phi}{\partial\zeta^2} \tag{16.2}$$

for the temperature distribution function $\Phi(\xi, \eta, \zeta, t) = \psi(x, y, z, t)$.

For simplicity, we first solve the time-dependent PDE for a homogeneous
one-dimensional medium, for example, a long metal rod in the x-direction:

$$\frac{\partial\psi}{\partial t} = a^2\frac{\partial^2\psi}{\partial x^2}, \tag{16.3}$$

where the constant a measures the diffusivity or heat conductivity of the
medium. We attempt to solve this linear PDE with constant coefficients with
the relevant exponential product ansatz $\psi = e^{\alpha x}\cdot e^{\beta t}$, which, when substituted
into Eq. (16.3), solves the PDE with the constraint $\beta = a^2\alpha^2$ for the
parameters. We seek exponentially decaying solutions for large times, that is,
solutions with negative β values, and therefore set $\alpha = i\omega$, $\alpha^2 = -\omega^2$ for real
ω and have

$$\psi = e^{i\omega x}e^{-\omega^2a^2t} = (\cos\omega x + i\sin\omega x)\,e^{-\omega^2a^2t}.$$

Forming real linear combinations, we obtain the solution

$$\psi(x, t) = (A\cos\omega x + B\sin\omega x)\,e^{-\omega^2a^2t}$$

for any choice of A, B, ω, which are introduced to satisfy boundary conditions.
Upon summing over multiples nω of the basic frequency for periodic boundary
conditions or integrating over the parameter ω for general (nonperiodic)
boundary conditions, we find a solution

$$\psi(x, t) = \int [A(\omega)\cos\omega x + B(\omega)\sin\omega x]\,e^{-a^2\omega^2t}\,d\omega, \tag{16.4}$$

that is general enough to be adapted to boundary conditions at t = 0, for
example. When the boundary condition gives a nonzero temperature $\psi_0$, as
for our rod, then the summation method applies (Fourier expansion of the
boundary condition). If the space is unrestricted (as for an infinitely extended
rod) the Fourier integral applies.


• This summation or integration over parameters is one of the standard methods
for generalizing specific PDE solutions in order to adapt them to boundary
conditions.


EXAMPLE 16.2.1 


A Specific Boundary Condition Let us solve a one-dimensional case explicitly,
where the temperature at time t = 0 is $\psi_0(x) = 1 = \text{const.}$ in the
interval between x = +1 and x = −1 and zero for x > 1 and x < −1. At the
ends, x = ±1, the temperature is always held at zero.

For a finite interval we choose the $\cos(l\pi x/2)$ spatial solutions of Eq. (16.3)
for integer l because (for odd l) they vanish at x = ±1. Thus, at t = 0 our solution is a
Fourier series (see Example 8.9.1),

$$\psi(x, 0) = \sum_{l=1}^{\infty} a_l\cos\frac{\pi lx}{2} = 1, \qquad -1 < x < 1,$$

with coefficients (see Section 14.1)

$$a_l = \int_{-1}^{1} 1\cdot\cos\frac{\pi lx}{2}\,dx = \frac{2}{l\pi}\sin\frac{\pi lx}{2}\bigg|_{x=-1}^{1} = \frac{4}{\pi l}\sin\frac{l\pi}{2} = \frac{4(-1)^m}{(2m+1)\pi}, \qquad l = 2m+1;$$

$$a_l = 0, \qquad l = 2m.$$

Including its time dependence, the full solution is given by the series

$$\psi(x, t) = \frac{4}{\pi}\sum_{m=0}^{\infty}\frac{(-1)^m}{2m+1}\cos\left[(2m+1)\frac{\pi x}{2}\right]e^{-t((2m+1)\pi a/2)^2}, \tag{16.5}$$

which converges absolutely for t > 0, but only conditionally at t = 0, as a
result of the discontinuity at x = ±1.
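
The behavior of the series in Eq. (16.5) can be checked numerically with a short sketch. The function name `rod_temperature`, the truncation at 200 terms, and the sample points are illustrative assumptions. For t > 0 the exponential factor makes the partial sum converge very quickly, so a modest number of terms suffices.

```python
from math import cos, exp, pi


def rod_temperature(x, t, a=1.0, terms=200):
    """Partial sum of the Fourier series, Eq. (16.5), for ends held at zero."""
    total = 0.0
    for m in range(terms):
        n = 2 * m + 1
        decay = exp(-t * (n * pi * a / 2.0) ** 2)
        total += ((-1) ** m / n) * cos(n * pi * x / 2.0) * decay
    return 4.0 / pi * total


# Shortly after t = 0 the midpoint is still at the initial temperature 1,
# while the ends stay at zero for all times.
early_center = rod_temperature(0.0, 0.01)
end_value = rod_temperature(1.0, 0.05)
# At late times only the slowest-decaying m = 0 mode survives.
late_center = rod_temperature(0.0, 1.0)
```

At late times the result approaches (4/π) exp(−π²a²t/4) cos(πx/2), the m = 0 term alone.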

Without the restriction to zero temperature at the end points of the previous
finite interval, the Fourier series is replaced by a Fourier integral. The general
solution is then given by Eq. (16.4). At t = 0, the given temperature distribution
$\psi_0 = 1$ gives the coefficients as (see Section 15.2)

$$A(\omega) = \frac{1}{\pi}\int_{-1}^{1}\cos\omega x\,dx = \frac{1}{\pi}\frac{\sin\omega x}{\omega}\bigg|_{x=-1}^{1} = \frac{2\sin\omega}{\pi\omega}, \qquad B(\omega) = 0.$$

Therefore,

$$\psi(x, t) = \frac{2}{\pi}\int_0^\infty\frac{\sin\omega}{\omega}\cos(\omega x)\,e^{-a^2\omega^2t}\,d\omega. \tag{16.6}$$

In three dimensions the corresponding exponential ansatz $\psi = e^{i\mathbf{k}\cdot\mathbf{r}/a + \beta t}$
leads to a solution with the relation $\beta = -\mathbf{k}^2 = -k^2$ for its parameter, and the
three-dimensional form of Eq. (16.3) becomes

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + k^2\psi = 0, \tag{16.7}$$

which is called the Helmholtz equation and may be solved by the separation
method just like the previously discussed Laplace equation in Cartesian,
cylindrical, or spherical coordinates.

In Cartesian coordinates, with the product ansatz of Eq. (8.114) the separated
x and y ODEs from Eq. (16.3) are the same as Eqs. (8.117) and (8.120),
whereas the z ODE [Eq. (8.121)] generalizes to

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = -k^2 + l^2 + m^2 = n^2 > 0, \tag{16.8}$$

where we introduce another separation constant $n^2$, constrained by

$$k^2 = l^2 + m^2 - n^2 \tag{16.9}$$

to produce a symmetric set of equations. Now our solution of Helmholtz’s
Eq. (16.7) is labeled according to the choice of all three separation constants
l, m, n subject to the constraint Eq. (16.9). As before, the z ODE [Eq. (16.8)]
yields exponentially decaying solutions $\sim e^{-nz}$. The boundary condition at z = 0
fixes the expansion coefficients $a_{lm}$ by the same Eq. (8.123).
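
The constraint in Eq. (16.9) can be checked with a small finite-difference sketch. It assumes one particular separated product, cos(lx) cos(my) exp(−nz); the function name `helmholtz_residual`, the sample point, and the step size are illustrative assumptions. The residual of (∇² + k²)ψ should vanish, up to discretization error, only when k² = l² + m² − n².

```python
from math import cos, exp


def helmholtz_residual(l, m, n, point, k2=None, h=1e-3):
    """Finite-difference value of (Laplacian + k^2) psi at one point."""
    if k2 is None:
        k2 = l * l + m * m - n * n  # constraint of Eq. (16.9)

    def psi(x, y, z):
        return cos(l * x) * cos(m * y) * exp(-n * z)

    x, y, z = point
    laplacian = (
        psi(x + h, y, z) + psi(x - h, y, z)
        + psi(x, y + h, z) + psi(x, y - h, z)
        + psi(x, y, z + h) + psi(x, y, z - h)
        - 6.0 * psi(x, y, z)
    ) / (h * h)
    return laplacian + k2 * psi(x, y, z)


sample = (0.3, 0.2, 0.4)
consistent = helmholtz_residual(2.0, 3.0, 1.5, sample)
# A k^2 that ignores the n^2 term leaves a residual of order n^2 * psi.
inconsistent = helmholtz_residual(2.0, 3.0, 1.5, sample, k2=2.0 ** 2 + 3.0 ** 2)
```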

In cylindrical coordinates, we now use the separation constant $l^2$ for the z
ODE with an exponentially decaying solution in mind,

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = l^2 > 0, \tag{16.10}$$

so that $Z \sim e^{-lz}$ because the temperature goes to zero at large z. Setting
$k^2 + l^2 = n^2$, Eqs. (8.131)-(8.132) stay the same so that we end up with the
same Fourier-Bessel expansion [Eq. (8.137)], as before.

In spherical coordinates, the separation method leads to the same angular
ODEs in Eqs. (8.142) and (8.145), whereas the radial ODE now becomes

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + k^2R - \frac{l(l+1)}{r^2}R = 0 \tag{16.11}$$

instead of Eq. (8.146), whose solutions are the spherical Bessel functions of
Section 12.4. They are listed in Table 16.2.

The restriction that $k^2$ be a constant is unnecessarily severe. The separation
process will still work with Helmholtz’s PDE for $k^2$ as general as

$$k^2 = f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi) + k'^2. \tag{16.12}$$

In the hydrogen atom we have $k^2 = f(r)$ in the Schrödinger wave equation,
and this leads to a closed-form solution involving Laguerre polynomials.


Biographical Data 

Helmholtz, Hermann Ludwig Ferdinand von. Helmholtz, a German
physiologist and physicist, was born in 1821 in Potsdam near Berlin and
died in 1894 in Charlottenburg, now Berlin. He studied medicine in Berlin,
graduating in 1842. In 1849, with Humboldt’s support, he was appointed
professor of physiology at the University of Königsberg in Prussia, now Russia.
He studied the function of the human eye and ear, defining the quality of
a tone in terms of overtones (more rapid vibrations than the basic one).
However, he is best known for his contributions to physics and energy
conservation in particular. He showed that the earth’s age would be less than
25 million years if the sun’s energy came from gravitational contraction.
He suggested to his student Heinrich Hertz that he should prove that the
electromagnetic spectrum extends well beyond the visible light.


Alternate Solutions


In a new approach to the heat flow PDE suggested by experiments, we now
return to the one-dimensional PDE [Eq. (16.3)], seeking solutions of a new
functional form $\psi(x, t) = u(x/\sqrt{t})$, which is suggested by Example 15.2.2.
Substituting $u(\xi)$, $\xi = x/\sqrt{t}$, into Eq. (16.3) using

$$\frac{\partial\psi}{\partial x} = \frac{u'}{\sqrt{t}}, \qquad \frac{\partial^2\psi}{\partial x^2} = \frac{u''}{t}, \qquad \frac{\partial\psi}{\partial t} = -\frac{x}{2\sqrt{t^3}}\,u' \tag{16.13}$$

with the notation $u'(\xi) \equiv du/d\xi$, the PDE is reduced to the ODE

$$2a^2u''(\xi) + \xi u'(\xi) = 0. \tag{16.14}$$

Writing this ODE as

$$\frac{u''}{u'} = -\frac{\xi}{2a^2},$$

we can integrate it once to get $\ln u' = -\xi^2/4a^2 + \ln C_1$ with an integration constant
$C_1$. Exponentiating and integrating again, we find the solution

$$u(\xi) = C_1\int_0^{\xi} e^{-\xi^2/4a^2}\,d\xi + C_2, \tag{16.15}$$

involving two integration constants $C_i$. Normalizing this solution at time t = 0
to temperature +1 for x > 0 and −1 for x < 0, our boundary conditions, fixes
the constants $C_i$ so that

$$\psi = \frac{1}{a\sqrt{\pi}}\int_0^{x/\sqrt{t}} e^{-\xi^2/4a^2}\,d\xi = \frac{2}{\sqrt{\pi}}\int_0^{x/2a\sqrt{t}} e^{-v^2}\,dv = \Phi\left(\frac{x}{2a\sqrt{t}}\right), \tag{16.16}$$

where Φ denotes Gauss’s error function (see Exercise 5.10.4). See Example 15.2.2
for a derivation using a Fourier transform. We need to generalize
this specific solution to adapt it to boundary conditions.
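
As a quick consistency check of Eq. (16.16), the sketch below evaluates the residual ∂ψ/∂t − a² ∂²ψ/∂x² by central differences. The function names, the step size h, and the sample points are illustrative assumptions; Python's `math.erf` is used for Φ. The residual should be zero up to discretization error, and the solution should approach ±1 on either side of the origin as t → 0.

```python
from math import erf, sqrt


def similarity_solution(x, t, a=1.0):
    """psi(x, t) = Phi(x / (2 a sqrt(t))), with Phi the error function, Eq. (16.16)."""
    return erf(x / (2.0 * a * sqrt(t)))


def heat_residual(x, t, a=1.0, h=1e-3):
    """Central-difference estimate of d(psi)/dt - a^2 d^2(psi)/dx^2."""
    psi = similarity_solution
    dpsi_dt = (psi(x, t + h, a) - psi(x, t - h, a)) / (2.0 * h)
    d2psi_dx2 = (psi(x + h, t, a) - 2.0 * psi(x, t, a) + psi(x - h, t, a)) / (h * h)
    return dpsi_dt - a * a * d2psi_dx2


residual = heat_residual(0.7, 0.5)
residual_other_a = heat_residual(0.7, 0.5, a=2.0)
```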

To this end, we now generate new solutions of the PDE with constant
coefficients by differentiating a special solution [Eq. (16.16)]. In other
words, if $\psi(x, t)$ solves the PDE in Eq. (16.3), so do $\partial\psi/\partial t$ and $\partial\psi/\partial x$ because these
derivatives and the differentiations of the PDE commute; that is, the order in
which they are carried out does not matter. Note that this method no longer
works if any coefficient of the PDE depends on t or x explicitly. However,
PDEs with constant coefficients dominate in physics. Examples are Newton’s
equations of motion (ODEs) in classical mechanics, the wave equations of
electrodynamics, and Poisson’s and Laplace’s equations in electrostatics and
gravity. Even Einstein’s nonlinear field equations of general relativity take on
this special form in local geodesic coordinates.

Therefore, by differentiating Eq. (16.16) with respect to x, we find the
simpler, more basic solution

$$\psi_1(x, t) = \frac{1}{a\sqrt{t\pi}}\,e^{-x^2/4a^2t} \tag{16.17}$$

and, repeating the process, another basic solution

$$\psi_2(x, t) = \frac{x}{2a^3\sqrt{t^3\pi}}\,e^{-x^2/4a^2t}. \tag{16.18}$$

Again, these solutions have to be generalized to adapt them to boundary
conditions. Also, there is another method of generating new solutions of a PDE
with constant coefficients: We can translate a given solution, for example,
$\psi_1(x, t) \to \psi_1(x - \alpha, t)$, and then integrate over the translation parameter α.
Therefore,

$$\psi(x, t) = \frac{1}{2a\sqrt{t\pi}}\int_{-\infty}^{\infty} C(\alpha)\,e^{-(x-\alpha)^2/4a^2t}\,d\alpha \tag{16.19}$$

is again a solution, which we rewrite using the substitution

$$\xi = \frac{x - \alpha}{2a\sqrt{t}}, \qquad \alpha = x - 2a\xi\sqrt{t}, \qquad d\alpha = -2a\,d\xi\sqrt{t}. \tag{16.20}$$

Thus, we find that

$$\psi(x, t) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} C(x - 2a\xi\sqrt{t})\,e^{-\xi^2}\,d\xi \tag{16.21}$$

is a solution of our PDE. In this form we recognize the significance of the
weight function C(x) from the translation method because, at t = 0, $\psi(x, 0) = C(x) = \psi_0(x)$
is determined by the boundary condition, and $\int_{-\infty}^{\infty} e^{-\xi^2}\,d\xi = \sqrt{\pi}$.
Therefore, we can write the solution as

$$\psi(x, t) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\psi_0(x - 2a\xi\sqrt{t})\,e^{-\xi^2}\,d\xi, \tag{16.22}$$

displaying the role of the boundary condition explicitly. From Eq. (16.22) we
see that the initial temperature distribution $\psi_0(x)$ spreads out over time and is
damped by the Gaussian weight function.
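
Equation (16.22) lends itself to direct numerical evaluation. The sketch below approximates the integral with the trapezoidal rule on a truncated ξ interval; the function names, the cutoff `xi_max`, the number of points, and the Gaussian initial profile exp(−x²) are illustrative assumptions. For that profile, completing the square in the exponent gives the closed form exp(−x²/(1 + 4a²t))/√(1 + 4a²t), which serves as a check and shows the spreading and damping explicitly.

```python
from math import exp, pi, sqrt


def evolve(psi0, x, t, a=1.0, xi_max=8.0, n=2001):
    """Trapezoidal estimate of Eq. (16.22) for an initial profile psi0."""
    step = 2.0 * xi_max / (n - 1)
    total = 0.0
    for i in range(n):
        xi = -xi_max + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * psi0(x - 2.0 * a * xi * sqrt(t)) * exp(-xi * xi)
    return total * step / sqrt(pi)


def gaussian(x):
    """Hypothetical initial temperature profile."""
    return exp(-x * x)


def gaussian_exact(x, t, a=1.0):
    """Closed form of Eq. (16.22) for the initial profile exp(-x**2)."""
    spread = 1.0 + 4.0 * a * a * t
    return exp(-x * x / spread) / sqrt(spread)


numeric = evolve(gaussian, 0.3, 0.5)
exact = gaussian_exact(0.3, 0.5)
```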


EXAMPLE 16.2.2 


Special Boundary Condition Again Let us express the solution of Example 16.2.1
in terms of the error function solution of Eq. (16.16). The boundary
condition at t = 0 is $\psi_0(x) = 1$ for −1 < x < 1 and zero for |x| > 1.
From Eq. (16.22) we find the limits on the integration variable ξ by setting
$x - 2a\xi\sqrt{t} = \pm 1$. This yields the integration end points $\xi = (\pm 1 + x)/2a\sqrt{t}$.
Therefore, our solution becomes

$$\psi(x, t) = \frac{1}{\sqrt{\pi}}\int_{(x-1)/2a\sqrt{t}}^{(x+1)/2a\sqrt{t}} e^{-\xi^2}\,d\xi.$$

Using the error function defined in Eq. (16.16) we can also write this solution
as follows:

$$\psi(x, t) = \frac{1}{2}\left[\Phi\left(\frac{x+1}{2a\sqrt{t}}\right) - \Phi\left(\frac{x-1}{2a\sqrt{t}}\right)\right]. \tag{16.23}$$

Comparing this form of our solution with that from Example 16.2.1, we see
that we can express Eq. (16.23) as the Fourier integral of Example 16.2.1,
an identity that gives the Fourier integral [Eq. (16.6)] in closed form in terms of the
tabulated error function. ■
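
The identity between Eq. (16.6) and Eq. (16.23) can be tested numerically. In the sketch below the Fourier integral is evaluated with Simpson's rule on a truncated ω interval; the function names, the cutoff `omega_max`, and the number of panels are illustrative assumptions, chosen so that the factor exp(−a²ω²t) is negligible at the cutoff for the sample values used. For much smaller a²t a larger cutoff would be needed.

```python
from math import cos, erf, exp, pi, sin, sqrt


def closed_form(x, t, a=1.0):
    """Error-function form of the solution, Eq. (16.23)."""
    scale = 2.0 * a * sqrt(t)
    return 0.5 * (erf((x + 1.0) / scale) - erf((x - 1.0) / scale))


def fourier_integral(x, t, a=1.0, omega_max=40.0, n=8000):
    """Simpson's-rule estimate of the Fourier integral, Eq. (16.6); n must be even."""
    def integrand(w):
        sinc = 1.0 if w == 0.0 else sin(w) / w
        return sinc * cos(w * x) * exp(-a * a * w * w * t)

    h = omega_max / n
    total = integrand(0.0) + integrand(omega_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (2.0 / pi) * total * h / 3.0


differences = [
    abs(fourier_integral(x, 0.1) - closed_form(x, 0.1)) for x in (0.0, 0.5, 1.5)
]
```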
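Finally, we consider the heat flow case for an extended spherically symmetric
medium centered at the origin, which prescribes polar coordinates
r, θ, φ. We expect a solution of the form $\psi(\mathbf{r}, t) = u(r, t)$. Using Eq. (2.77) we
find the PDE

$$\frac{\partial u}{\partial t} = a^2\left(\frac{\partial^2u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r}\right), \tag{16.24}$$

which we transform to the one-dimensional heat flow PDE by the substitution

$$u = \frac{v(r, t)}{r}, \qquad \frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial r} - \frac{v}{r^2}, \qquad \frac{\partial u}{\partial t} = \frac{1}{r}\frac{\partial v}{\partial t},$$

$$\frac{\partial^2u}{\partial r^2} = \frac{1}{r}\frac{\partial^2v}{\partial r^2} - \frac{2}{r^2}\frac{\partial v}{\partial r} + \frac{2v}{r^3}. \tag{16.25}$$

This yields the PDE

$$\frac{\partial v}{\partial t} = a^2\frac{\partial^2v}{\partial r^2}. \tag{16.26}$$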


EXAMPLE 16.2.3 


Spherically Symmetric Heat Flow Let us apply the one-dimensional heat
flow PDE with the solution Eq. (16.16) to a spherically symmetric heat flow
under fairly common boundary conditions, where x is replaced by the radial
variable. Initially, we have zero temperature everywhere. Then, at time t = 0, a
finite amount of heat energy Q is released at the origin, spreading evenly in all
directions. What is the resulting spatial and temporal temperature distribution?

Inspecting our special solution in Eq. (16.18), we see that, for t → 0, the
temperature

$$\frac{v(r, t)}{r} = \frac{C}{\sqrt{t^3}}\,e^{-r^2/4a^2t} \tag{16.27}$$

goes to zero for all r ≠ 0 so that zero initial temperature is guaranteed. As
t → ∞, the temperature v/r → 0 for all r, including the origin, which is
implicit in our boundary conditions. The constant C can be determined from
energy conservation, which gives the constraint

$$Q = \sigma\rho\int\frac{v}{r}\,d^3r = \frac{4\pi\sigma\rho C}{\sqrt{t^3}}\int_0^\infty r^2e^{-r^2/4a^2t}\,dr = 8\sqrt{\pi^3}\,\sigma\rho a^3C, \tag{16.28}$$

where ρ is the constant density of the medium and σ its specific heat. Here,
we have rescaled the integration variable and integrated by parts to get

$$\int_0^\infty e^{-r^2/4a^2t}\,r^2\,dr = (2a\sqrt{t})^3\int_0^\infty e^{-\xi^2}\xi^2\,d\xi,$$

$$\int_0^\infty e^{-\xi^2}\xi^2\,d\xi = -\frac{\xi}{2}e^{-\xi^2}\bigg|_0^\infty + \frac{1}{2}\int_0^\infty e^{-\xi^2}\,d\xi = \frac{\sqrt{\pi}}{4}.$$
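
The energy constraint in Eq. (16.28) says that the integrated heat content does not depend on t. The sketch below checks this numerically by integrating the temperature of Eq. (16.27) over all space with Simpson's rule; the function name `total_heat`, the radial cutoff, the number of panels, and the unit values chosen for C, σ, ρ, and a are illustrative assumptions.

```python
from math import exp, pi, sqrt


def total_heat(t, a=1.0, c=1.0, sigma=1.0, rho=1.0, n=4000):
    """Simpson's-rule estimate of Q = sigma * rho * integral of (v/r) over all space."""
    r_max = 12.0 * 2.0 * a * sqrt(t)  # the integrand is negligible beyond this radius
    h = r_max / n

    def integrand(r):
        temperature = c / t ** 1.5 * exp(-r * r / (4.0 * a * a * t))  # Eq. (16.27)
        return 4.0 * pi * r * r * temperature

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return sigma * rho * total * h / 3.0


expected = 8.0 * pi ** 1.5  # right-hand side of Eq. (16.28) with unit constants
heat_at_times = [total_heat(t) for t in (0.1, 1.0, 10.0)]
```

The same value at all three times is the numerical counterpart of energy conservation, and solving Eq. (16.28) for C then fixes the amplitude in terms of Q.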
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SUMMARY 


The temperature, as given by Eq. (16.27) at any moment, that is at fixed t is 
a Gaussian distribution that flattens out as time increases because its width 
is proportional to *Jt. As a function of time, the temperature is proportional 
to / :i/2 e 7 7( with T = r 2 /4a 2 , which rises from zero to a maximum and then 
falls to zero again for large times. To find the maximum, we set 

—(t~ 3/2 e~ T/t ) = r 5/2 e~ T/t (- - -) = 0, (16.29) 

dt \t 2/ 

from which we find t = 2T/3. ■ 


In the case of cylindrical symmetry (in the plane z — 0 in plane polar 
coordinates p = Jx 2 + y l , (p) we search for a temperature i// = u(p, t) that 
then satisfies the ODE [using Eq. (2.21) in the diffusion equation] 


3 u 
3 1 


= a 


3 2 u 
3p 2 


1 3 u 
P dp 


(16.30) 


which is the planar analog of Eq. (16.26). This PDE also has solutions with the
functional dependence $\rho/\sqrt{t} \equiv r$. Upon substituting

$$u = v\!\left(\frac{\rho}{\sqrt{t}}\right), \quad \frac{\partial u}{\partial t} = -\frac{\rho v'}{2t^{3/2}}, \quad \frac{\partial u}{\partial\rho} = \frac{v'}{\sqrt{t}}, \quad \frac{\partial^2 u}{\partial\rho^2} = \frac{v''}{t} \tag{16.31}$$

into Eq. (16.30) with the notation $v' \equiv dv/dr$, we find the ODE

$$a^2 v'' + \left(\frac{a^2}{r} + \frac{r}{2}\right)v' = 0. \tag{16.32}$$

This is a first-order ODE for $v'$, which we can integrate when we separate the
variables $v'$ and $r$ as

$$\frac{v''}{v'} = -\left(\frac{1}{r} + \frac{r}{2a^2}\right). \tag{16.33}$$

This yields

$$v'(r) = \frac{C}{r}e^{-r^2/4a^2} = C\frac{\sqrt{t}}{\rho}e^{-\rho^2/4a^2t}. \tag{16.34}$$

This special solution for cylindrical symmetry can be similarly generalized
and adapted to boundary conditions as for the spherical case. Finally, the
$z$-dependence can be factored in, as $z$ separates from the plane polar radial
variable $\rho$.
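The integration leading to Eq. (16.34) can be verified by substituting the result back into Eq. (16.32). The following sketch does this numerically; the finite-difference step, the choice $C = 1$, and the function names are assumptions of the example.

```python
import math

def v_prime(r, a, C=1.0):
    """Candidate first integral v'(r) = (C/r) exp(-r^2/(4 a^2))."""
    return C / r * math.exp(-r * r / (4.0 * a * a))

def ode_residual(r, a, h=1e-5):
    """Left-hand side a^2 v'' + (a^2/r + r/2) v', with v'' from a central difference."""
    v_second = (v_prime(r + h, a) - v_prime(r - h, a)) / (2.0 * h)
    return a * a * v_second + (a * a / r + 0.5 * r) * v_prime(r, a)
```

The residual should vanish up to discretization error for any $r > 0$ and any diffusion constant $a$.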


SUMMARY

PDEs can be solved with initial conditions, just like ODEs, or with boundary
conditions prescribing the value of the solution or its derivative on boundary
surfaces, curves, or points. When the solution is prescribed on the boundary,
the PDE is called a Dirichlet problem; if the normal derivative of the solution
is prescribed on the boundary, the PDE is called a Neumann problem.
When the initial temperature is prescribed for the one-dimensional or
three-dimensional heat equation (with spherical or cylindrical symmetry), it
becomes a weight function of the solution in terms of an integral over the
generic Gaussian solution. The three-dimensional heat equation, with spherical
or cylindrical boundary conditions, is solved by separation of the variables
leading to eigenfunctions in each separated variable and eigenvalues as
separation constants. For finite boundary intervals in each spatial coordinate, the
sum over separation constants leads to a Fourier series solution, whereas
infinite boundary conditions lead to a Fourier integral solution. The separation
of variables method attempts to solve a PDE by writing the solution as a
product of functions of one variable each. General conditions for the separation
method to work are provided by the symmetry properties of the PDE to which
continuous group theory applies.


EXERCISES 


16.2.1 By letting the operator $\nabla^2 + k^2$ act on the general form $a_1\psi_1(x, y, z) +
a_2\psi_2(x, y, z)$, show that it is linear; that is, $(\nabla^2 + k^2)(a_1\psi_1 + a_2\psi_2) =
a_1(\nabla^2 + k^2)\psi_1 + a_2(\nabla^2 + k^2)\psi_2$.

16.2.2 Show that the Helmholtz equation

$$\nabla^2\psi + k^2\psi = 0$$

is separable in circular cylindrical coordinates if $k^2$ is generalized to
$k^2 + f(\rho) + (1/\rho^2)g(\varphi) + h(z)$.

16.2.3 Separate variables in the Helmholtz equation in spherical polar
coordinates, splitting off the radial dependence first. Show that your
separated equations have the same form as Eqs. (8.142) and (8.145), whereas
Eq. (8.146) is modified. Describe in your own words what this exercise
tells you.

16.2.4 Verify that

$$\nabla^2\psi(r, \theta, \varphi) + \left[k^2 + f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi)\right]\psi(r, \theta, \varphi) = 0$$

is separable (in spherical polar coordinates). The functions $f$, $g$, and $h$
are functions only of the variables indicated; $k^2$ is a constant.

16.2.5 For a homogeneous spherical solid with constant thermal diffusivity,
$K$, and no heat sources, the equation of heat conduction becomes

$$\frac{\partial T(r, t)}{\partial t} = K\nabla^2 T(r, t).$$

Assume a solution of the form

$$T = R(r)T(t)$$

and separate variables. Show that the radial equation may take on the
standard form

$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + [\alpha^2r^2 - n(n+1)]R = 0; \quad n = \text{integer}.$$

The solutions of this equation are called spherical Bessel functions.

16.2.6 Separate variables in the thermal diffusion equation of Eq. (8.112) in
circular cylindrical coordinates. Assume that you can neglect end effects
and take $\psi = T(\rho, t)$.


16.3 Inhomogeneous PDE—Green’s Function 


The series substitution of Section 8.5 and the Wronskian double integral of
Section 8.6 provide the most general solution of the homogeneous, linear,
second-order ODE. The specific solution, $y_p$, is linearly dependent on the source term
[$F(x)$ of Eq. (8.44)] and may be cranked out by the variation of parameters
method. In this section, we discuss a different method of solution for PDEs:
Green's functions.

For a brief introduction to the Green's function method as applied to the
solution of a nonhomogeneous PDE, it is helpful to use the electrostatic analog.
In the presence of charges, the electrostatic potential $\psi$ satisfies Poisson's
nonhomogeneous equation (compare Section 1.13)

$$\nabla^2\psi = -\frac{\rho}{\varepsilon_0} \quad \text{(mks units)} \tag{16.35}$$

and Laplace's homogeneous equation,

$$\nabla^2\psi = 0, \tag{16.36}$$

in the absence of electric charge ($\rho = 0$). If the charges are point charges $q_i$
located at $\mathbf{r}_i$, we know that the solution is

$$\psi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\sum_i \frac{q_i}{|\mathbf{r} - \mathbf{r}_i|}, \tag{16.37}$$

a superposition of single-point charge solutions obtained from Coulomb's law
for the force between two point charges $q_1$ and $q_2$ a distance $r$ apart,

$$\mathbf{F} = \frac{q_1q_2\hat{\mathbf{r}}}{4\pi\varepsilon_0r^2}. \tag{16.38}$$

By replacement of the discrete point charges with a smeared-out distributed
charge of charge density $\rho$, Eq. (16.37) becomes

$$\psi(\mathbf{r} = 0) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r})}{r}\, d\tau \tag{16.39}$$

or, for the potential at $\mathbf{r}_1$ away from the origin and the charge at $\mathbf{r}_2$,

$$\psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\, d\tau_2. \tag{16.40}$$
We use $\psi$ as the potential corresponding to the given distribution of charge
and therefore satisfying Poisson's equation [Eq. (16.35)], whereas a function
$G$, which we label Green's function, is required to satisfy Poisson's equation
with a point source at the point defined by $\mathbf{r}_2$:

$$\nabla^2G = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{16.41}$$

Physically, then, $G$ is the potential at $\mathbf{r}_1$ corresponding to a unit source at $\mathbf{r}_2$.
By Green's theorem (Section 1.10),

$$\int(\psi\nabla^2G - G\nabla^2\psi)\, d\tau_2 = \int(\psi\nabla G - G\nabla\psi)\cdot d\boldsymbol{\sigma}_2. \tag{16.42}$$

Assuming that the integrand falls off faster than $r^{-2}$, we may simplify our
problem by taking the volume so large that the surface integral vanishes, leaving

$$\int\psi\nabla^2G\, d\tau_2 = \int G\nabla^2\psi\, d\tau_2, \tag{16.43}$$

or by substituting in Eqs. (16.35) and (16.41), we have

$$-\int\psi(\mathbf{r}_2)\delta(\mathbf{r}_1 - \mathbf{r}_2)\, d\tau_2 = -\int\frac{G(\mathbf{r}_1, \mathbf{r}_2)\rho(\mathbf{r}_2)}{\varepsilon_0}\, d\tau_2. \tag{16.44}$$

Integration by employing the defining property of the Dirac delta function
[Eq. (1.151)] produces

$$\psi(\mathbf{r}_1) = \frac{1}{\varepsilon_0}\int G(\mathbf{r}_1, \mathbf{r}_2)\rho(\mathbf{r}_2)\, d\tau_2. \tag{16.45}$$

Note that we have used Eq. (16.41) to eliminate $\nabla^2G$, but that the function
$G$ is still unknown. In Section 1.14, Gauss's law, we found that

$$\int\nabla^2\left(\frac{1}{r}\right)d\tau = \begin{cases} 0 \\ -4\pi, \end{cases} \tag{16.46}$$

0 if the volume did not include the origin and $-4\pi$ if the origin was included.
This result from Section 1.14 may be rewritten as in Eq. (1.148), or

$$\nabla^2\left(\frac{1}{4\pi r}\right) = -\delta(\mathbf{r}) \quad \text{or} \quad \nabla^2\left(\frac{1}{4\pi r_{12}}\right) = -\delta(\mathbf{r}_1 - \mathbf{r}_2), \tag{16.47}$$

corresponding to a shift of the electrostatic charge from the origin to the
position $\mathbf{r} = \mathbf{r}_2$. Here, $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$, and the Dirac delta function $\delta(\mathbf{r}_1 - \mathbf{r}_2)$
vanishes unless $\mathbf{r}_1 = \mathbf{r}_2$. Therefore, in a comparison of Eqs. (16.47) and (16.41),
the function $G$ (Green's function) is given by

$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{16.48}$$

The solution of our differential equation (Poisson's equation) is

$$\psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\, d\tau_2, \tag{16.49}$$
in complete agreement with Eq. (16.40). Actually, $\psi(\mathbf{r}_1)$ [Eq. (16.49)] is the
particular solution of Poisson's equation. We may add solutions of Laplace's
equation [compare Eq. (8.113)]. Such solutions could describe an external field.
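For point charges, the integral in Eq. (16.45) collapses to a sum over sources weighted by the Green's function of Eq. (16.48). The sketch below illustrates that superposition; the function names, the `(q, position)` data layout, and the default `eps0 = 1.0` (a unit choice) are assumptions of this example only.

```python
import math

def green_poisson(r1, r2):
    """Free-space Green's function 1/(4 pi |r1 - r2|) of Eq. (16.48)."""
    return 1.0 / (4.0 * math.pi * math.dist(r1, r2))

def potential(r1, charges, eps0=1.0):
    """psi(r1) = (1/eps0) * sum_i G(r1, r_i) q_i for point charges (q_i, r_i)."""
    return sum(q * green_poisson(r1, r) for q, r in charges) / eps0

# Hypothetical example: a dipole along z; the potential vanishes on the midplane.
dipole = [(1.0, (0.0, 0.0, 0.5)), (-1.0, (0.0, 0.0, -0.5))]
midplane_value = potential((2.0, 1.0, 0.0), dipole)
```

Each source contributes in proportion to $1/|\mathbf{r}_1 - \mathbf{r}_2|$, which is the weighting role of $G$ in Eq. (16.49).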

These results can be generalized to the second-order linear, but inhomogeneous,
differential equation

$$\mathcal{L}y(\mathbf{r}_1) = -f(\mathbf{r}_1), \tag{16.50}$$

where $\mathcal{L}$ is a linear differential operator. The Green's function is taken to be a
solution of

$$\mathcal{L}G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2) \tag{16.51}$$

[analogous to Eq. (16.41)]. Then the particular solution $y(\mathbf{r}_1)$ becomes

$$y(\mathbf{r}_1) = \int G(\mathbf{r}_1, \mathbf{r}_2)f(\mathbf{r}_2)\, d\tau_2, \tag{16.52}$$

which can be verified by applying $\mathcal{L}$ to $y(\mathbf{r}_1)$. (There may also be an integral
over a bounding surface depending on the conditions specified.)

SUMMARY

In summary, the Green's function, often written $G(\mathbf{r}_1, \mathbf{r}_2)$ as a reminder of the
name, is a solution of Eq. (16.41) or Eq. (16.51) more generally. It enters in
an integral solution of our differential equation, as in Eq. (16.45). For the
simple, but important, electrostatic case, we obtain Green's function $G(\mathbf{r}_1, \mathbf{r}_2)$ by
Gauss's law, comparing Eqs. (16.41) and (16.47). Finally, from the final solution
[Eq. (16.49)] it is possible to develop a physical interpretation of Green's
function. It occurs as a weighting function or propagator function that enhances
or reduces the effect of the charge element $\rho(\mathbf{r}_2)\, d\tau_2$ according to its distance
from the field point $\mathbf{r}_1$. The Green's function, $G(\mathbf{r}_1, \mathbf{r}_2)$, gives the effect of a unit
point source at $\mathbf{r}_2$ in producing a potential at $\mathbf{r}_1$. This is how it was introduced
in Eq. (16.41); this is how it appears in Eq. (16.49).


EXAMPLE 16.3.1


Quantum Mechanical Scattering—Neumann Series Solution The quantum
theory of scattering provides a good illustration of integral equation
techniques and an application of a Green's function. Our physical picture of
scattering is as follows. A beam of particles moves along the negative $z$-axis toward the
origin. A small fraction of the particles are scattered by the potential $V(\mathbf{r})$ and
go off as an outgoing spherical wave, as shown schematically in Fig. 16.1. Our
wave function $\psi(\mathbf{r})$ must satisfy the time-independent Schrödinger equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\psi(\mathbf{r}) = E\psi(\mathbf{r}) \tag{16.53a}$$

or

$$\nabla^2\psi(\mathbf{r}) + k^2\psi(\mathbf{r}) = \frac{2m}{\hbar^2}V(\mathbf{r})\psi(\mathbf{r}), \quad k^2 = \frac{2mE}{\hbar^2}. \tag{16.53b}$$

From the physical picture just presented, we search for a solution having an
asymptotic form

$$\psi(\mathbf{r}) \sim e^{i\mathbf{k}_0\cdot\mathbf{r}} + f_k(\theta, \varphi)\frac{e^{ikr}}{r}, \tag{16.54}$$

which we shall derive from the integral equation following from the
Schrödinger equation. Here, $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ is the incident plane wave,$^2$ with $\mathbf{k}_0$ the
propagation vector carrying the subscript 0 to indicate that it is in the $\theta = 0$ ($z$-axis)
direction. The magnitudes $k_0$ and $k$ are equal (for elastic scattering ignoring
recoil of the target), and $e^{ikr}/r$ is the outgoing spherical wave with an
angular (and energy)-dependent amplitude factor $f_k(\theta, \varphi)$.$^3$ In quantum mechanics
texts, it is shown that the differential probability of scattering, $d\sigma/d\Omega$, the
scattering cross section per unit solid angle, is given by $|f_k(\theta, \varphi)|^2$.

Figure 16.1 Incident Plane Wave Scattered by a Potential V into an Outgoing Spherical Wave

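Identifying $[-(2m/\hbar^2)V(\mathbf{r})\psi(\mathbf{r})]$ with $f(\mathbf{r})$ of Eq. (16.50), we have

$$\psi(\mathbf{r}) = -\frac{2m}{\hbar^2}\int V(\mathbf{r}_2)\psi(\mathbf{r}_2)G(\mathbf{r}, \mathbf{r}_2)\, d^3r_2 \tag{16.55}$$

by Eq. (16.52). This does not have the desired asymptotic form of Eq. (16.54),
but we may add $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ to Eq. (16.55), a solution of the homogeneous equation,
and put $\psi(\mathbf{r})$ into the desired form:

$$\psi(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}} - \frac{2m}{\hbar^2}\int V(\mathbf{r}_2)\psi(\mathbf{r}_2)G(\mathbf{r}, \mathbf{r}_2)\, d^3r_2. \tag{16.56}$$

Our Green's function is the inverse of the operator $\mathcal{L} = \nabla^2 + k^2$ [Eq. (16.53)]
satisfying the boundary condition that it describe an outgoing wave. Then,
from Example 16.3.2,

$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$$

and

$$\psi(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}} - \frac{2m}{4\pi\hbar^2}\int V(\mathbf{r}_2)\psi(\mathbf{r}_2)\frac{e^{ik|\mathbf{r} - \mathbf{r}_2|}}{|\mathbf{r} - \mathbf{r}_2|}\, d^3r_2. \tag{16.57}$$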


2 For simplicity, we assume a continuous incident beam. In a more sophisticated and more realistic
treatment, Eq. (16.54) would be one component of a Fourier wave packet.

3 If $V(\mathbf{r})$ represents a central force, $f_k$ will be a function of $\theta$ only, independent of azimuth.

This integral equation analog of the original Schrödinger wave equation is
exact.

Now to prove the asymptotic relation [Eq. (16.54)], we let $r \to \infty$, while the
integration variable $\mathbf{r}_2$ in Eq. (16.57) is restricted to the small target volume. We
approximate the denominator by $|\mathbf{r} - \mathbf{r}_2| \approx r$, but in the oscillatory numerator
of absolute value unity we have to be more precise. Here, we expand

$$|\mathbf{r} - \mathbf{r}_2|^2 = (\mathbf{r} - \mathbf{r}_2)^2 = r^2 - 2\mathbf{r}\cdot\mathbf{r}_2 + r_2^2 \approx r^2\left(1 - \frac{2}{r^2}\mathbf{r}\cdot\mathbf{r}_2\right)$$

so that

$$|\mathbf{r} - \mathbf{r}_2| \approx r - \hat{\mathbf{r}}\cdot\mathbf{r}_2, \quad r \to \infty.$$

Substituting these asymptotic expressions into Eq. (16.57) yields

$$\psi(\mathbf{r}) \sim e^{i\mathbf{k}_0\cdot\mathbf{r}} - \frac{2m}{4\pi\hbar^2}\frac{e^{ikr}}{r}\int V(\mathbf{r}_2)\psi(\mathbf{r}_2)e^{-ik\hat{\mathbf{r}}\cdot\mathbf{r}_2}\, d^3r_2, \tag{16.58}$$

which is the asymptotic relation [Eq. (16.54)] with

$$f_k(\theta, \varphi) = -\frac{2m}{4\pi\hbar^2}\int V(\mathbf{r}_2)\psi(\mathbf{r}_2)e^{-ik\hat{\mathbf{r}}\cdot\mathbf{r}_2}\, d^3r_2, \tag{16.59}$$

where $\mathbf{r} = (r, \theta, \varphi)$ for large $r$ is the location of the detector that measures the
scattered wave so that $\theta$ is the scattering angle.
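The far-field replacement $|\mathbf{r} - \mathbf{r}_2| \approx r - \hat{\mathbf{r}}\cdot\mathbf{r}_2$ used above can be tested directly. In this sketch the detector direction and the target point are arbitrary values chosen for illustration, and the function names are hypothetical.

```python
import math

def far_field_distance(r, r2):
    """Approximation r - r_hat . r2 to the distance |r - r2| for large r."""
    norm = math.hypot(*r)
    r_hat_dot_r2 = sum(x * y for x, y in zip(r, r2)) / norm
    return norm - r_hat_dot_r2

def far_field_error(R, r2=(0.3, -0.2, 0.5)):
    """Absolute error of the approximation for a detector at distance R."""
    r = (0.6 * R, 0.0, 0.8 * R)   # fixed unit direction (0.6, 0, 0.8), scaled by R
    return abs(math.dist(r, r2) - far_field_distance(r, r2))
```

The error falls off roughly like $r_2^2/r$, so the phase error $k$ times this quantity becomes negligible for a detector far from a small target.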

Employing an iterative (Neumann) series technique (remember that the
scattering probability is very small), we have

$$\psi_0(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}}, \tag{16.60a}$$

which has the physical interpretation of no scattering.

Substituting $\psi_0(\mathbf{r}_2) = e^{i\mathbf{k}_0\cdot\mathbf{r}_2}$ into the integral, we obtain the first correction
term

$$\psi_1(\mathbf{r}) = e^{i\mathbf{k}_0\cdot\mathbf{r}} - \frac{2m}{4\pi\hbar^2}\int V(\mathbf{r}_2)\frac{e^{ik|\mathbf{r} - \mathbf{r}_2|}}{|\mathbf{r} - \mathbf{r}_2|}e^{i\mathbf{k}_0\cdot\mathbf{r}_2}\, d^3r_2. \tag{16.60b}$$

This is the famous Born approximation. It is expected to be most accurate
for weak potentials and high incident energy. If a more accurate approximation
is desired, the perturbation series may be continued.$^4$ Substituting the incident
plane wave into the scattering amplitude, Eq. (16.59) yields the first-order Born
approximation

$$f_k(\theta, \varphi) = -\frac{m}{2\pi\hbar^2}\int V(\mathbf{r}_2)e^{i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}_2}\, d^3r_2, \tag{16.61}$$

4 This assumes the iteration (Neumann) series is convergent. In some physical situations, it is not
convergent and then other techniques are needed.
where $\mathbf{k} = k\hat{\mathbf{r}}$ is the scattered wave vector, and $\mathbf{q} = \mathbf{k}_0 - \mathbf{k}$ is the
momentum transfer. In other words, the elastic scattering amplitude in first-order
Born approximation is proportional to the Fourier transform of the potential
at $\mathbf{q}$. ■
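For a central potential the angular integration in Eq. (16.61) can be done in closed form, leaving $f = -(2m/\hbar^2q)\int_0^\infty rV(r)\sin(qr)\, dr$ with $q = |\mathbf{k}_0 - \mathbf{k}|$. The sketch below evaluates this radial integral for the mesonic potential $V_0e^{-\alpha r}/\alpha r$ of Exercise 16.3.19 and compares it with the answer quoted there. The units $m = \hbar = 1$, the Simpson rule, the cutoff `r_max`, and all function names are assumptions of the example.

```python
import math

def born_amplitude_radial(V, q, m=1.0, hbar=1.0, r_max=60.0, n=60000):
    """First-order Born amplitude for a central potential V(r).

    Uses f(q) = -(2 m / (hbar^2 q)) * int_0^inf r V(r) sin(q r) dr,
    evaluated with Simpson's rule on [0, r_max] (n must be even).
    """
    h = r_max / n

    def g(r):
        # Integrand r V(r) sin(q r); it vanishes at r = 0 when r V(r) stays finite there.
        return r * V(r) * math.sin(q * r) if r > 0.0 else 0.0

    total = g(0.0) + g(r_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return -2.0 * m / (hbar ** 2 * q) * (total * h / 3.0)

def yukawa(V0, alpha):
    """Mesonic potential V0 exp(-alpha r) / (alpha r)."""
    return lambda r: V0 * math.exp(-alpha * r) / (alpha * r)

def born_amplitude_yukawa_exact(V0, alpha, q, m=1.0, hbar=1.0):
    """Closed form -(2 m V0 / (hbar^2 alpha)) / (alpha^2 + q^2)."""
    return -2.0 * m * V0 / (hbar ** 2 * alpha) / (alpha ** 2 + q ** 2)
```

The amplitude depends on the angles only through $q$, consistent with the statement that it is the Fourier transform of the potential at $\mathbf{q}$.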


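EXAMPLE 16.3.2

Quantum Mechanical Scattering—Green's Function Again, we consider
the Schrödinger wave equation [Eq. (16.53b)] for the scattering problem. This
time we use Fourier transform techniques and we derive the desired form of
the Green's function by contour integration.

We solve the PDE for the Green's function corresponding to Eq. (16.53b)

$$(\nabla^2 + k^2)G(\mathbf{r}, \mathbf{r}_2) = -\delta(\mathbf{r} - \mathbf{r}_2) = -\int\frac{d^3p}{(2\pi)^3}e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}_2)} \tag{16.62}$$

in terms of a Fourier integral

$$G(\mathbf{r}, \mathbf{r}_2) = \int\frac{d^3p}{(2\pi)^3}g_0(\mathbf{p})e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}_2)} \tag{16.63}$$

of the same type as the delta function driving term. Substituting the Fourier
integral into the PDE, we see that

$$\nabla^2e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}_2)} = -p^2e^{i\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}_2)}$$

so that the PDE becomes the trivial algebraic equation $(k^2 - p^2)g_0(\mathbf{p}) = -1$
for the Fourier transform $g_0$ of $G$. To find $G$, we need to evaluate the Fourier
integral

$$G(\mathbf{r}, \mathbf{r}_2) = \int\frac{d^3p}{(2\pi)^3}\frac{e^{ip\rho\cos\theta}}{p^2 - k^2}, \tag{16.64}$$

where $p^2/(p^2 - k^2)$ is part of the radial integral, and $d^3p = p^2\, dp\sin\theta\, d\theta\, d\varphi$.
Here, $p\rho\cos\theta$ has replaced $\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}_2)$ for simplicity, with $\rho = |\mathbf{r} - \mathbf{r}_2|$. With the substitution
$x = \cos\theta$, $dx = -\sin\theta\, d\theta$, the angular part in polar coordinates in momentum
space is elementary. Integrating over $\varphi$ by inspection, we pick up a $2\pi$. The $\theta$
integration then leads to

$$\int_0^{2\pi}d\varphi\int_{\theta=0}^{\pi}e^{ip\rho\cos\theta}\sin\theta\, d\theta = 2\pi\int_{-1}^{1}e^{ip\rho x}\, dx = \frac{2\pi}{ip\rho}\left(e^{ip\rho} - e^{-ip\rho}\right). \tag{16.65}$$

The remaining radial integral is given by

$$G(\mathbf{r}, \mathbf{r}_2) = \frac{1}{4\pi^2\rho i}\int_0^\infty\frac{e^{ip\rho} - e^{-ip\rho}}{p^2 - k^2}\, p\, dp, \tag{16.66}$$

and since the integrand is an even function of $p$, we may set

$$G(\mathbf{r}, \mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\int_{-\infty}^{\infty}\frac{e^{i\kappa} - e^{-i\kappa}}{\kappa^2 - \sigma^2}\, \kappa\, d\kappa. \tag{16.67}$$

The latter step is taken in anticipation of the evaluation of $G(\mathbf{r}, \mathbf{r}_2)$ as a contour
integral. The symbols $\kappa$ and $\sigma$ ($\sigma > 0$) represent $p\rho$ and $k\rho$, respectively.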
If the integral in Eq. (16.67) is interpreted as a Riemann integral, the
integral does not exist. This implies that $\mathcal{L}^{-1}$ does not exist, and in a literal sense
it does not. $\mathcal{L} = \nabla^2 + k^2$ is singular since there exist nontrivial solutions $\psi$
for which the homogeneous equation $\mathcal{L}\psi = 0$. We avoid this problem by
introducing a parameter $\gamma$, defining a different operator $\mathcal{L}_\gamma^{-1}$, and taking the limit
as $\gamma \to 0$.

Splitting the integral into two parts so each part may be written as a suitable
contour integral gives us

$$G(\mathbf{r}, \mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\oint_{C_1}\frac{\kappa e^{i\kappa}\, d\kappa}{\kappa^2 - \sigma^2} - \frac{1}{8\pi^2\rho i}\oint_{C_2}\frac{\kappa e^{-i\kappa}\, d\kappa}{\kappa^2 - \sigma^2}. \tag{16.68}$$

Contour $C_1$ is closed by a semicircle in the upper half-plane, and $C_2$ is closed by
a semicircle in the lower half-plane. These integrals were evaluated in
Chapter 7 by using appropriately chosen infinitesimal semicircles to go around the
singular points $\kappa = \pm\sigma$. As an alternative procedure, let us first displace the
singular points from the real axis by replacing $\sigma$ by $\sigma + i\gamma$ and then, after
evaluation, taking the limit as $\gamma \to 0$ (Fig. 16.2).

Figure 16.2 Possible Green's Function Contours of Integration

For $\gamma$ positive, contour $C_1$ encloses the singular point $\kappa = \sigma + i\gamma$ and the
first integral contributes

$$2\pi i\cdot\frac{1}{2}e^{i(\sigma + i\gamma)}.$$

From the second integral we also obtain

$$-2\pi i\cdot\frac{1}{2}e^{i(\sigma + i\gamma)},$$

the enclosed singularity being $\kappa = -(\sigma + i\gamma)$. Returning to Eq. (16.68) and
letting $\gamma \to 0$, we have the retarded Green's function that propagates forward
in time,

$$G(\mathbf{r}, \mathbf{r}_2) = \frac{e^{i\sigma}}{4\pi\rho} = \frac{e^{ik|\mathbf{r} - \mathbf{r}_2|}}{4\pi|\mathbf{r} - \mathbf{r}_2|}, \tag{16.69}$$

in full agreement with Exercise 16.3.15. This solution of Helmholtz's PDE for
a point source depends on starting with $\gamma$ positive. Had we chosen $\gamma$ negative,
our Green's function would have included $e^{-i\sigma}$, which corresponds to an
incoming wave. The choice of positive $\gamma$ is dictated by the boundary conditions
we wish to satisfy.
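Away from the source point, the result in Eq. (16.69) must satisfy the homogeneous Helmholtz equation. For a function of $\rho = |\mathbf{r} - \mathbf{r}_2|$ alone, the Laplacian reduces to $(1/\rho)\, d^2(\rho G)/d\rho^2$, which the following sketch evaluates by finite differences. The step size and the function names are assumptions made for illustration.

```python
import cmath
import math

def green_helmholtz(rho, k):
    """Outgoing-wave Green's function exp(i k rho) / (4 pi rho) of Eq. (16.69)."""
    return cmath.exp(1j * k * rho) / (4.0 * math.pi * rho)

def helmholtz_residual(rho, k, h=1e-4):
    """(1/rho) d^2(rho G)/d rho^2 + k^2 G for rho > 0, using a central second difference."""
    def u(x):
        return x * green_helmholtz(x, k)
    u_second = (u(rho + h) - 2.0 * u(rho) + u(rho - h)) / (h * h)
    return u_second / rho + k * k * green_helmholtz(rho, k)
```

The incoming wave $e^{-ik\rho}/4\pi\rho$ passes the same test, so the differential equation alone cannot distinguish the two; the sign of $\gamma$, that is, the boundary condition, does.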

Equations (16.57) and (16.69) reproduce the scattered wave in Eq. (16.53b) 
and constitute an exact solution of the approximate Eq. (16.53b). Exercises 
16.3.17, 16.3.18, and 16.3.19 extend these results. 


EXERCISES 

16.3.1 Verify Eq. (16.42):

$$\int(v\mathcal{L}_2u - u\mathcal{L}_2v)\, d\tau_2 = \int p(v\nabla_2u - u\nabla_2v)\cdot d\boldsymbol{\sigma}_2.$$

16.3.2 Show that the terms $+k^2$ in the Helmholtz operator and $-k^2$ in the
modified Helmholtz operator do not affect the behavior of $G(\mathbf{r}_1, \mathbf{r}_2)$ in
the immediate vicinity of the singular point $\mathbf{r}_1 = \mathbf{r}_2$. Specifically, show
that

$$\lim_{|\mathbf{r}_1 - \mathbf{r}_2| \to 0}\int k^2G(\mathbf{r}_1, \mathbf{r}_2)\, d\tau_2 = 0.$$

16.3.3 Show that

$$\frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$$

satisfies the two appropriate criteria and therefore is a Green's
function for the Helmholtz equation.

16.3.4 (a) Find the Green's function for the three-dimensional Helmholtz
equation, Exercise 16.3.3, when the wave is a standing wave.

(b) How is this Green's function related to the spherical Bessel
functions?

16.3.5 The homogeneous Helmholtz equation

$$\nabla^2\varphi + \lambda^2\varphi = 0$$

has eigenvalues $\lambda_i^2$ and eigenfunctions $\varphi_i$. Show that the corresponding
Green's function that satisfies

$$\nabla^2G(\mathbf{r}_1, \mathbf{r}_2) + \lambda^2G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2)$$

may be written as

$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{i=1}^{\infty}\frac{\varphi_i(\mathbf{r}_1)\varphi_i(\mathbf{r}_2)}{\lambda_i^2 - \lambda^2}.$$

An expansion of this form is called a bilinear expansion. If the Green's
function is available in closed form, this provides a means of
generating functions (e.g., see Chapter 11).

16.3.6 An electrostatic potential (mks units) is

$$\varphi(\mathbf{r}) = \frac{Z}{4\pi\varepsilon_0}\frac{e^{-ar}}{r}.$$

Reconstruct the electrical charge distribution that will produce this
potential. Note that $\varphi(r)$ vanishes exponentially for large $r$, showing
that the net charge is zero.

ANS. $\rho(r) = Z\delta(\mathbf{r}) - \dfrac{Za^2}{4\pi}\dfrac{e^{-ar}}{r}$.

16.3.7 Transform the ODE

$$\frac{d^2y(r)}{dr^2} - k^2y(r) + V_0\frac{e^{-r}}{r}y(r) = 0$$

and the boundary conditions $y(0) = y(\infty) = 0$ into a Fredholm
integral equation of the form

$$y(r) = \lambda\int_0^\infty G(r, t)\frac{e^{-t}}{t}y(t)\, dt.$$

The quantities $V_0 = \lambda$ and $k^2$ are constants. The ODE is derived from
the Schrödinger wave equation with a mesonic potential.

ANS. $G(r, t) = \begin{cases} \dfrac{1}{k}e^{-kt}\sinh kr, & 0 \le r < t, \\ \dfrac{1}{k}e^{-kr}\sinh kt, & t < r < \infty. \end{cases}$

16.3.8 A charged conducting ring of radius $a$ (Example 11.3.3) may be
described by

$$\rho(\mathbf{r}) = \frac{q}{2\pi a^2}\delta(r - a)\delta(\cos\theta).$$

Using the known Green's function for this system, find the electrostatic
potential.

Hint. Exercise 11.3.3 will be helpful.




16.3.9 Changing a separation constant from $k^2$ to $-k^2$ and putting the
discontinuity of the first derivative into the $z$-dependence, show that

$$\frac{1}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{4\pi}\sum_{m=-\infty}^{\infty}\int_0^\infty e^{im(\varphi_1 - \varphi_2)}J_m(k\rho_1)J_m(k\rho_2)e^{-k|z_1 - z_2|}\, dk.$$

Hint. Develop $\delta(\rho_1 - \rho_2)$ from Exercise 15.5.3.

16.3.10 Derive the expansion

$$\frac{\exp[ik|\mathbf{r}_1 - \mathbf{r}_2|]}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|} = ik\sum_{l=0}^{\infty}\left\{\begin{array}{ll} j_l(kr_1)h_l^{(1)}(kr_2), & r_1 < r_2 \\ j_l(kr_2)h_l^{(1)}(kr_1), & r_1 > r_2 \end{array}\right\} \times \sum_{m=-l}^{l}Y_l^m(\theta_1, \varphi_1)Y_l^{m*}(\theta_2, \varphi_2).$$

Hint. The left side is a known Green's function. Assume a spherical
harmonic expansion and work on the remaining radial dependence.
The spherical harmonic closure relation covers the angular
dependence.

16.3.11 Show that the modified Helmholtz operator Green's function

$$\frac{\exp(-k|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$$

has the spherical polar coordinate expansion

$$\frac{\exp(-k|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|} = k\sum_{l=0}^{\infty}i_l(kr_<)k_l(kr_>)\sum_{m=-l}^{l}Y_l^m(\theta_1, \varphi_1)Y_l^{m*}(\theta_2, \varphi_2).$$

Note. The modified spherical Bessel functions $i_l(kr)$ and $k_l(kr)$ are
defined in Table 16.2.

16.3.12 A pointlike particle of mass m generates a Yukawa potential V = 
—Ge~ r/k /(mr). Obtain the potential outside a spherical distribution 
of such particles of total mass M and radius R. 

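16.3.13 From the spherical Green's function of Exercise 16.3.10, derive the
plane wave expansion

$$e^{i\mathbf{k}\cdot\mathbf{r}} = \sum_{l=0}^{\infty}i^l(2l + 1)j_l(kr)P_l(\cos\gamma),$$

where $\gamma$ is the angle included between $\mathbf{k}$ and $\mathbf{r}$. This is the Rayleigh
equation of Exercise 11.4.7.

Hint. Take $r_2 \gg r_1$ so that

$$|\mathbf{r}_1 - \mathbf{r}_2| \to r_2 - \mathbf{r}_{20}\cdot\mathbf{r}_1 = r_2 - \frac{\mathbf{k}\cdot\mathbf{r}_1}{k}.$$

Let $r_2 \to \infty$ and cancel a factor of $e^{ikr_2}/r_2$.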
16.3.14 From the results of Exercises 16.3.10 and 16.3.13, show that



16.3.15 Noting that



is an eigenfunction of

$$(\nabla^2 + k^2)\psi_{\mathbf{k}}(\mathbf{r}) = 0,$$

show that the Green's function of $\mathcal{L} = \nabla^2$ may be expanded as



16.3.16 Using Fourier transforms, show that the Green's function satisfying
the nonhomogeneous Helmholtz equation

$$(\nabla^2 + k_0^2)G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2)$$

is



16.3.17 The basic equation of the scalar Kirchhoff diffraction theory is



where $\psi$ satisfies the homogeneous Helmholtz equation and $r =
|\mathbf{r}_1 - \mathbf{r}_2|$. Derive this equation. Assume that $\mathbf{r}_1$ is interior to the closed
surface $S_2$.

Hint. Use Green's theorem.

16.3.18 The Born approximation for the scattered wave is given by Eq. (16.54)
[and Eq. (16.61)]. From the asymptotic form [Eq. (16.58)],



For a scattering potential $V(r_2)$ that is independent of angles and for
$r \gg r_2$, show that



Here, $\mathbf{k}_0$ is in the $\theta = 0$ (original $z$-axis) direction, whereas $\mathbf{k}$ is in the
$(\theta, \varphi)$ direction. The magnitudes are equal: $|\mathbf{k}_0| = |\mathbf{k}|$; $m$ is the reduced
mass.

Hint. You have Exercise 16.3.13 to simplify the exponential and
Example 15.5.3 to transform the three-dimensional Fourier exponential
transform into a one-dimensional Fourier sine transform.


16.3.19 Calculate the scattering amplitude $f_k(\theta, \varphi)$ for a mesonic potential
$V(r) = V_0(e^{-\alpha r}/\alpha r)$, $V_0$ a constant.

Hint. This particular potential permits the Born integral, Exercise
16.3.18, to be evaluated as a Laplace transform.

ANS. $f_k(\theta, \varphi) = -\dfrac{2mV_0}{\hbar^2\alpha}\dfrac{1}{\alpha^2 + (\mathbf{k}_0 - \mathbf{k})^2}$.

16.3.20 The mesonic potential $V(r) = V_0(e^{-\alpha r}/\alpha r)$ may be used to describe
the Coulomb scattering of two charges $q_1$ and $q_2$. We let $\alpha \to 0$ and
$V_0 \to 0$ but take the ratio $V_0/\alpha$ to be $q_1q_2/4\pi\varepsilon_0$. (For Gaussian units,
omit the $4\pi\varepsilon_0$.) Show that the differential scattering cross section
$d\sigma/d\Omega = |f_k(\theta, \varphi)|^2$ is given by

$$\frac{d\sigma}{d\Omega} = \left(\frac{q_1q_2}{4\pi\varepsilon_0}\right)^2\frac{1}{16E^2\sin^4(\theta/2)}, \quad E = \frac{p^2}{2m} = \frac{\hbar^2k^2}{2m}.$$

It happens (coincidentally) that this Born approximation is in exact
agreement with both the exact quantum mechanical calculations and
the classical Rutherford calculation.


Additional Reading 


Bateman, H. (1944). Partial Differential Equations of Mathematical Physics. 
Dover, New York. A wealth of applications of various PDEs in classical 
physics. Excellent examples of the use of different coordinate systems— 
ellipsoidal, paraboloidal, toroidal coordinates, and so on. 

Cohen, H. (1992). Mathematics for Scientists and Engineers. Prentice-Hall, 
Englewood Cliffs, NJ. 

Courant, R., and Hilbert, D. (1953). Methods of Mathematical Physics, Vol. 1 
(English edition). Interscience, New York. Reprinted, Wiley, New York 
(1989). This is one of the classic works of mathematical physics. Originally 
published in German in 1924, the revised English edition is an excellent 
reference for a rigorous treatment of Green’s functions and a wide variety 
of other topics on mathematical physics. 

Davis, P. J., and Rabinowitz, P. (1967). Numerical Integration. Blaisdell, 
Waltham, MA. This book covers a great deal of material in a relatively 
easy-to-read form. Appendix 1 (On the Practical Evaluation of Integrals 
by M. Abramowitz) is excellent as an overall view. 

Garcia, A. L. (1994). Numerical Methods for Physics. Prentice-Hall, Englewood 
Cliffs, NJ. 

Hamming, R.W. (1973). Numerical Methods for Scientists and Engineers, 2nd 
ed. McGraw-Hill, New York. Reprinted, Dover, New York (1987). This well- 
written text discusses a wide variety of numerical methods from zeros 
of functions to the fast Fourier transform. All topics are selected and 
developed with a modern high-speed computer in mind. 








Hubbard, J., and West, B. H. (1995). Differential Equations. Springer, Berlin.

Margenau, H., and Murphy, G. M. (1956). The Mathematics of Physics and
Chemistry, 2nd ed. Van Nostrand, Princeton, NJ. Chapter 5 covers
curvilinear coordinates and 13 specific coordinate systems.

Morse, P. M., and Feshbach, H. (1953). Methods of Theoretical Physics.
McGraw-Hill, New York. Chapter 5 includes a description of several
different coordinate systems. Note that Morse and Feshbach are not above
using left-handed coordinate systems even for Cartesian coordinates.
Elsewhere in this excellent (and difficult) book there are many examples of
the use of the various coordinate systems in solving physical problems.
Eleven additional fascinating but seldom encountered orthogonal
coordinate systems are discussed in the second edition of Mathematical Methods
for Physicists (1970). Chapter 7 is a particularly detailed, complete
discussion of Green's functions from the point of view of mathematical physics.
Note, however, that Morse and Feshbach frequently choose a source of
$4\pi\delta(\mathbf{r} - \mathbf{r}')$ in place of our $\delta(\mathbf{r} - \mathbf{r}')$. Considerable attention is devoted to
bounded regions.

Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1992).
Numerical Recipes, 2nd ed. Cambridge Univ. Press, Cambridge, UK.

Ralston, A., and Wilf, H. (Eds.) (1960). Mathematical Methods for Digital
Computers. Wiley, New York.

Ritger, P. D., and Rose, N. J. (1968). Differential Equations with Applications.
McGraw-Hill, New York.

Stakgold, I. (1997). Green's Functions and Boundary Value Problems, 2nd ed.
Wiley, New York.

Stoer, J., and Bulirsch, R. (1992). Introduction to Numerical Analysis.
Springer-Verlag, New York.




Probability 


Probabilities arise in many problems dealing with random events or large
numbers of particles defining random variables. An event is called random if it is
practically impossible to predict from the initial state. This includes those cases
in which we have merely incomplete information about initial states and/or
the dynamics, as in statistical mechanics, in which we may know the energy
of the system that corresponds to very many possible microscopic
configurations, preventing us from predicting individual outcomes. Often, the average
properties of many similar events are predictable, as in quantum theory. This
is why probability theory can be and has been developed.

Random variables are involved when data depend on chance, such as
weather reports or stock prices. The theory of probability describes
mathematical models of chance processes in terms of probability distributions of
random variables that describe how some “random events” are more likely
than others. In this sense, probability is a measure of our ignorance, giving
quantitative meaning to qualitative statements such as “It will probably rain
tomorrow” or “I’m unlikely to draw the heart queen.” Probabilities are of
fundamental importance in quantum mechanics and statistical mechanics and are
applied in meteorology, economics, games, and many other areas of daily life.

To a mathematician, probabilities are based on axioms, but we discuss here
practical ways of calculating probabilities for random events. Because
experiments in the sciences are always subject to errors, theories of errors and their
propagation involve probabilities. In statistics we deal with the applications
of probability theory to experimental data.


17.1 Definitions, Simple Properties 


All possible mutually exclusive$^1$ outcomes of an experiment that is subject
to chance represent the events or points of the sample space $S$.

1 This means that given that one particular event did occur, the others could not have occurred.

For example, each time we toss a coin we give the trial a number $i = 1, 2, \ldots$, and observe
the outcomes $x_i$. Here, the sample consists of two events, heads or tails, and
the $x_i$ represent a discrete random variable that takes on two values, heads or
tails. When two coins are tossed, the sample contains the events two heads,
one head and one tail, and two tails; the number of heads is a good value to
assign to the random variable, so the possible values are 2, 1, and 0. There are
four equally probable outcomes, of which one has value 2, two have value 1,
and one has value 0. So the probabilities of the three values of the random
variable are 1/4 for two heads (value 2), 1/4 for no heads (value 0), and 1/2
for value 1. In other words, we define the theoretical probability $P$ of an event
denoted by the point $x_i$ of the sample as

$$P(x_i) = \frac{\text{number of outcomes of event } x_i}{\text{total number of all events}}. \tag{17.1}$$
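The counting behind Eq. (17.1) can be made explicit by enumerating the equally likely outcomes. This is a small sketch with hypothetical names; it uses exact fractions so that no rounding enters.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_coins):
    """Probability of each number of heads when n_coins fair coins are tossed."""
    outcomes = list(product("HT", repeat=n_coins))   # all equally likely outcomes
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {heads: Fraction(c, len(outcomes)) for heads, c in counts.items()}
```

For two coins the four outcomes HH, HT, TH, TT give the probabilities 1/4, 1/2, and 1/4 for the values 2, 1, and 0.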


An experimental definition applies when the total number of events is not well
defined (or difficult to obtain) or equally likely outcomes do not always occur.
Then

$$P(x_i) = \frac{\text{number of times event } x_i \text{ occurs}}{\text{total number of trials}} \tag{17.2}$$

is more appropriate. A large, thoroughly mixed pile of black and white sand
grains of the same size and in equal proportions is a relevant example because
it is impractical to count them all. However, we can count the grains in a
small sample volume that we pick. This way, we can check that white and
black grains turn up with approximately equal probability 1/2, provided we
put back each sample and mix the pile again. It is found that the larger the
sample volume, the smaller the spread about 1/2 will be. The more trials we
run, the closer to 1/2 the average probability of all trial counts will be.
We could even pick single grains and check if the probability 1/4 of picking
two black grains in a row equals that of two white grains, etc. There are many
statistics questions we can pursue. Therefore, piles of colored sand provide
for instructive experiments.
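The sand-pile experiment is easy to imitate with a pseudorandom generator. In the sketch below each grain is drawn independently with probability 1/2 of being white, which models a pile so large that sampling does not change its composition; that modeling choice, the sample sizes, and the function names are assumptions of the example.

```python
import random

def white_fraction(sample_size, rng):
    """Fraction of white grains in one sample drawn from a well-mixed 50/50 pile."""
    whites = sum(rng.random() < 0.5 for _ in range(sample_size))
    return whites / sample_size

def mean_and_spread(sample_size, trials, rng):
    """Mean and standard deviation of the white fraction over repeated samples."""
    fractions = [white_fraction(sample_size, rng) for _ in range(trials)]
    mean = sum(fractions) / trials
    variance = sum((f - mean) ** 2 for f in fractions) / trials
    return mean, variance ** 0.5
```

Calling `mean_and_spread(10, 2000, random.Random(0))` and then the same with a sample size of 1000 shows the behavior described above: the mean stays near 1/2 while the spread shrinks as the sample grows.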

The following axioms are self-evident:

• Probabilities satisfy $0 \le P \le 1$. Probability 1 means certainty and
probability 0 means impossibility.

• The entire sample has probability 1. For example, drawing an arbitrary card
has probability 1.

• The probabilities for mutually exclusive events add. The probability for
getting one head in two coin tosses is 1/2 = 1/4 + 1/4 because it is 1/4 for
head first and then tail, plus 1/4 for tail first and then head.


EXAMPLE 17.1.1 


Probability for A or B What is the probability for drawing$^2$ a club or a
jack from a shuffled deck of cards if one draws very often? Because there are
52 cards in a deck, each being equally likely, 13 cards for each suit and 4 jacks,
there are 13 clubs, including the club jack, and 3 other jacks; that is, there
are 16 possible cards out of 52, giving the probability (13 + 3)/52 = 16/52 =
4/13. ■

2 These are examples of non-mutually exclusive events.

Figure 17.1 The Shaded Area Gives the Intersection $A \cap B$ Corresponding to the A and B Events, and the Dashed Line Encloses $A \cup B$ Corresponding to A or B Events

If we represent the sample space by a set, S, of points, then events are
subsets A, B, ... of S denoted as A ⊂ S, etc. Two sets A, B are equal if
A is contained in B, A ⊂ B, and B is contained in A, B ⊂ A. The union
A ∪ B consists of all points (events) that are in A or B or both (Fig. 17.1). The
intersection A ∩ B consists of all points that are in both A and B. If A and B
have no common points, their intersection is the empty set, A ∩ B = ∅, which
has no elements (events). The set of points in A that are not in the intersection
of A and B is denoted by A − A ∩ B, defining a subtraction of sets. If we
take the club suit in Example 17.1.1 as set A, while the four jacks are set B, 
then their union comprises all clubs and jacks, whereas their intersection is 
the club jack only. 

Each subset A has its probability P(A) ≥ 0. In terms of these set theory
concepts and notations, the probability laws we just discussed become

$$0 \le P(A) \le 1.$$

The entire sample space has P(S) = 1. The probability of the union A ∪ B of
mutually exclusive events is the sum

$$P(A \cup B) = P(A) + P(B), \qquad A \cap B = \emptyset.$$

The addition rule for probabilities of arbitrary sets is given by the theorem

$$P(A \cup B) = P(A) + P(B) - P(A \cap B). \qquad (17.3)$$

To prove this, we decompose the union into two mutually exclusive sets
A ∪ B = A ∪ (B − B ∩ A), subtracting the intersection of A and B from
B before joining them. Their probabilities are P(A), P(B) − P(B ∩ A), which
we add. We could also have decomposed A ∪ B = (A − A ∩ B) ∪ B, from
which our theorem follows similarly by adding these probabilities P(A ∪ B) =
[P(A) − P(A ∩ B)] + P(B). Note that A ∩ B = B ∩ A (Fig. 17.1).
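The addition rule can be checked by brute-force counting on the card deck of Example 17.1.1. This is a small sketch with exact fractions; the encoding of cards as (rank, suit) pairs and the helper name `P` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely sample points

A = {card for card in deck if card[1] == "clubs"}  # event A: a club
B = {card for card in deck if card[0] == "J"}      # event B: a jack

def P(event):
    """Probability of an event as (favorable cases) / (all cases)."""
    return Fraction(len(event), len(deck))

lhs = P(A | B)                # P(A or B) counted directly
rhs = P(A) + P(B) - P(A & B)  # addition rule, Eq. (17.3)
print(lhs, rhs)
```

Python's set operators `|` and `&` play the roles of union and intersection here.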
Sometimes the rules and definitions of probabilities that we have discussed
so far are not sufficient, however.

EXAMPLE 17.1.2

Conditional Probability A simple example consists of a box of 10 identical
red and 20 identical blue pens arranged in random order from which we remove
pens successively, that is, without putting them back. Suppose we draw a red
pen first, event A. That will happen with probability P(A) = 10/30 = 1/3 if
the pens are thoroughly mixed up. The probability of drawing a blue pen in
the next round, event B, however, will depend on the fact that we drew a red
pen in the first round. It is given by P(B|A) = 20/29. There are 10 · 20 possible
sample points in two rounds, and the sample has 30 · 29 events, so the combined
probability is

$$P(A \cap B) = \frac{10 \cdot 20}{30 \cdot 29} = \frac{20}{87}.$$

In general, the probability P(A ∩ B) that A and B happen is given by the
product of the probability that A happens, P(A), and the probability that B
happens if A does, P(B|A):

$$P(A \cap B) = P(A)\,P(B|A). \qquad (17.4)$$

In other words, the conditional probability P(B|A) is given by the ratio

$$P(B|A) = \frac{P(A \cap B)}{P(A)}. \qquad (17.5)$$

If the conditional probability P(B|A) = P(B) is independent of A, then the
events A and B are called independent, and the combined probability

$$P(A \cap B) = P(A)\,P(B) \qquad (17.6)$$

is simply the product of both probabilities.

EXAMPLE 17.1.3

SAT Tests Colleges and universities rely on the verbal and mathematics
SAT test scores, among others, as predictors of a student’s success in passing 
courses and graduating. A research university is known to admit mostly students
with a combined verbal and mathematics score of more than 1400 points.
The graduation rate is 95%; that is, 5% drop out or transfer elsewhere. Of those 
who graduate, 97% have an SAT score of more than 1400 points, whereas 80% 
of those who drop out have an SAT score below 1400. Suppose a student has 
an SAT score below 1400; what is his/her probability of graduating? 

Let A be the cases having an SAT test score below 1400, B represent those
with scores above 1400, mutually exclusive events with P(A) + P(B) = 1,
and C those students who graduate. That is, we want to know the conditional
probabilities P(C|A) and P(C|B). To apply Eq. (17.5) we need P(A) and P(B).
There are 3% of students with scores below 1400 among those who graduate
(95%) and 80% of the 5% who do not graduate, so

$$P(A) = 0.03 \cdot 0.95 + \frac{4}{5} \cdot 0.05 = 0.0685,$$

and also

$$P(B) = 0.97 \cdot 0.95 + \frac{0.05}{5} = 0.9315,$$

$$P(C \cap A) = 0.03 \cdot 0.95 = 0.0285 \quad \text{and} \quad P(C \cap B) = 0.97 \cdot 0.95 = 0.9215.$$

Therefore,

$$P(C|A) = \frac{P(C \cap A)}{P(A)} = \frac{0.0285}{0.0685} \approx 41.6\%,$$

$$P(C|B) = \frac{P(C \cap B)}{P(B)} = \frac{0.9215}{0.9315} \approx 98.9\%,$$

that is, slightly less than 42% is the probability for a student with a score below
1400 to graduate at this particular university. ■
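The arithmetic of the SAT example can be retraced in a few lines. This is only a sketch of the calculation above using the percentages quoted in the example; the variable names are illustrative.

```python
# Data quoted in the example.
p_grad = 0.95            # P(C): graduation rate
p_low_given_grad = 0.03  # fraction of graduates with a score below 1400
p_low_given_drop = 0.80  # fraction of non-graduates with a score below 1400

# Joint probabilities P(C and A), P(C and B).
p_C_and_A = p_low_given_grad * p_grad
p_C_and_B = (1 - p_low_given_grad) * p_grad

# P(A): score below 1400, collected from graduates and non-graduates.
p_A = p_C_and_A + p_low_given_drop * (1 - p_grad)
p_B = 1 - p_A

# Conditional probabilities from Eq. (17.5).
p_C_given_A = p_C_and_A / p_A
p_C_given_B = p_C_and_B / p_B
print(round(p_C_given_A, 3), round(p_C_given_B, 3))
```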


As a corollary to the definition [Eq. (17.5)] of a conditional probability, we
compare P(A|B) = P(A ∩ B)/P(B) and P(B|A) = P(A ∩ B)/P(A), which
leads to Bayes theorem

$$P(A|B) = \frac{P(A)}{P(B)}\,P(B|A). \qquad (17.7)$$

This can be generalized to the theorem: If the random events A_i with
probabilities P(A_i) > 0 are mutually exclusive and their union represents the entire
sample S, then an arbitrary random event B ⊂ S has the probability

$$P(B) = \sum_{i=1}^{n} P(A_i)\,P(B|A_i). \qquad (17.8)$$

This decomposition law resembles the expansion of a vector into a basis
of unit vectors defining the components of the vector. This relation follows
from the obvious decomposition B = ∪_i (B ∩ A_i) (Fig. 17.2), which implies
P(B) = Σ_i P(B ∩ A_i) for the probabilities because the components B ∩ A_i
are mutually exclusive. For each i we know from Eq. (17.5) that P(B ∩ A_i) =
P(A_i)P(B|A_i), which proves the theorem.

Figure 17.2 The Shaded Area B Is Composed of Mutually Exclusive Subsets of B
Belonging also to A_1, A_2, A_3, where the A_i Are Mutually Exclusive
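Equations (17.7) and (17.8) can be exercised together on a two-set partition. The sketch below reuses the percentages of the SAT example, with the partition taken as "graduates" and "does not graduate" and B as "score below 1400"; the dictionary keys and variable names are illustrative assumptions.

```python
from fractions import Fraction

# P(A_i) for a partition of the sample into two mutually exclusive sets.
prior = {"graduate": Fraction(95, 100), "drop": Fraction(5, 100)}
# P(B | A_i), with B = "SAT score below 1400".
likelihood = {"graduate": Fraction(3, 100), "drop": Fraction(80, 100)}

# Decomposition law, Eq. (17.8): P(B) = sum_i P(A_i) P(B | A_i).
p_B = sum(prior[k] * likelihood[k] for k in prior)

# Bayes theorem, Eq. (17.7): P(A_i | B) = P(A_i) / P(B) * P(B | A_i).
posterior = {k: prior[k] / p_B * likelihood[k] for k in prior}
print(p_B, posterior["graduate"])
```

Using `Fraction` keeps every step exact, so the posterior probabilities must add up to 1 without rounding error.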
Counting of Permutations and Combinations


Counting particles in samples can help us find probabilities, as in statistical
mechanics.

If we have n different molecules, let us ask in how many ways we can
arrange them in a row, that is, permute them. This number is defined as the
number of their permutations. Thus, by definition, the order matters in
permutations. There are n choices of picking the first molecule, n − 1 for the
second, etc. Altogether, there are n! permutations of n different molecules or
objects.

Generalizing this, suppose there are n people but only k < n chairs to
seat them. In how many ways can we seat k people in the chairs? Counting as
before, we get

$$n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}$$

for the number of permutations of n different objects, k at a time.

We now consider the number of combinations of objects when their order
is irrelevant by definition. For example, three letters a, b, c can be combined,
two letters at a time, in 3 = 3!/(2! 1!) ways: ab, ac, bc. If letters can be repeated, then
we add the pairs aa, bb, cc, and have six combinations. Thus, a combination
of different particles differs from a permutation in that their order does
not matter. Combinations occur with repetition (the mathematician's way of
treating indistinguishable objects) and without, where no two sets contain the
same particles.

The number of different combinations of n particles, k at a time and without
repetitions, is given by the binomial coefficient

$$\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!}.$$

If repetition is allowed, then the number is

$$\binom{n+k-1}{k}.$$

In the number n!/(n − k)! of permutations of n particles, k at a time, we have to
divide out the number k! of permutations of the groups of k particles because
their order does not matter in a combination. This proves the first claim. The
second one is shown by mathematical induction.
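The three counting formulas can be compared with explicit enumeration for the letters a, b, c taken two at a time. This is a minimal sketch using Python's standard `itertools` and `math` modules; the variable names are illustrative.

```python
import math
from itertools import permutations, combinations, combinations_with_replacement

letters = "abc"
n, k = len(letters), 2

# Enumerate the arrangements explicitly.
perms = list(permutations(letters, k))                     # order matters
combs = list(combinations(letters, k))                     # order irrelevant
combs_rep = list(combinations_with_replacement(letters, k))  # repetition allowed

# Compare the counts with the closed formulas.
print(len(perms), math.perm(n, k))              # n!/(n-k)!
print(len(combs), math.comb(n, k))              # binomial coefficient
print(len(combs_rep), math.comb(n + k - 1, k))  # with repetition
```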

In statistical mechanics, we ask in how many ways we can put n particles
in k boxes so that there will be n_i (alike) particles in the ith box, without
regard to order in each box, with Σ_{i=1}^{k} n_i = n. Counting as before, there
are n choices for selecting the first particle, n − 1 for picking the second,
etc., but the n_1! permutations within the first box are discounted, and n_2!
disregarded permutations within the second box, etc. Therefore, the number
of combinations is

$$\frac{n!}{n_1!\,n_2!\cdots n_k!}, \qquad n_1 + n_2 + \cdots + n_k = n.$$
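The multinomial count can be cross-checked by listing every assignment of labelled particles to boxes and keeping those with the prescribed occupation numbers. The helper names `multinomial` and `brute_force` and the occupation numbers (2, 1, 1) are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(counts):
    """n! / (n_1! n_2! ... n_k!) for occupation numbers n_1, ..., n_k."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

def brute_force(counts):
    """Count assignments of n labelled particles to k boxes with the given occupancies."""
    n, k = sum(counts), len(counts)
    hits = 0
    for assignment in product(range(k), repeat=n):  # box index for each particle
        occupancy = Counter(assignment)
        if tuple(occupancy.get(box, 0) for box in range(k)) == tuple(counts):
            hits += 1
    return hits

print(multinomial((2, 1, 1)), brute_force((2, 1, 1)))
```

The brute-force loop visits k**n assignments, so it is only practical for small examples.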
In statistical mechanics, particles that obey

• Maxwell-Boltzmann (MB) statistics are distinguishable without restriction
on their number in each state;

• Bose-Einstein (BE) statistics are indistinguishable with no restriction on
the number of particles in each quantum state; and

• Fermi-Dirac (FD) statistics are indistinguishable with at most one particle
per state.

For example, putting three particles in four boxes, there are 4³ equally likely
arrangements for the MB case because each particle can be put into any box
in four ways, giving a total of 4³ choices. For BE statistics, the number of
combinations with repetitions is $\binom{3+4-1}{3} = \binom{6}{3}$. For FD statistics, it is
$\binom{4}{3}$. More generally for MB statistics, the number of distinct arrangements of n
particles among k states (boxes) is k^n, for BE statistics it is $\binom{n+k-1}{n}$ with k > n,
and for FD statistics it is $\binom{k}{n}$.
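The three counts for three particles in four boxes can be reproduced by enumeration. In this sketch a distinguishable (MB) arrangement is a tuple giving the box of each particle, an indistinguishable (BE) arrangement is a sorted multiset of boxes, and an FD arrangement is a set of distinct boxes; this encoding is an assumption made for illustration.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb

n_particles, n_boxes = 3, 4
boxes = range(n_boxes)

mb = len(list(product(boxes, repeat=n_particles)))                 # k**n
be = len(list(combinations_with_replacement(boxes, n_particles)))  # C(n+k-1, n)
fd = len(list(combinations(boxes, n_particles)))                   # C(k, n)

print(mb, be, fd)
```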


EXERCISES 

17.1.1 A card is drawn from a shuffled deck. What is the probability that
it is (a) black, (b) a red nine, or (c) a queen of spades?

17.1.2 Find the probability of drawing two kings from a shuffled deck of cards 
(a) if the first card is put back before the second is drawn, and (b) if 
the first card is not put back after being drawn. 

17.1.3 When two fair dice are thrown, what is the probability of (a) observing 
a number less than 4 or (b) a number greater than or equal to 4, but 
less than 6? 

17.1.4 Rolling three fair dice, what is the probability of obtaining six points? 

17.1.5 Determine the probability P(A ∩ B ∩ C) in terms of P(A), P(B), P(C),
etc.

17.1.6 Determine directly or by mathematical induction the probability of a
distribution of N (Maxwell-Boltzmann) particles in k boxes with N_1
in box 1, N_2 in box 2, ..., N_k in the kth box for any numbers N_j ≥ 1
with N_1 + N_2 + ··· + N_k = N, k < N. Repeat this for Fermi-Dirac and
Bose-Einstein particles.

17.1.7 Show that P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) −
P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).

17.1.8 Determine the probability that a positive integer n < 100 is divisible by
a prime number p < 100. Verify your result for p = 3, 5, 7.

17.1.9 Put two particles obeying Maxwell-Boltzmann (Fermi-Dirac, or Bose- 
Einstein) statistics in three boxes. How many ways are there in each 
case? 
Each time we toss a die, we give the trial a number i = 1, 2, ... and observe the
point x_i = 1, 2, 3, 4, 5, or 6 with probability 1/6. If i denotes the trial number,
then x_i is a discrete random variable that takes the discrete values from 1 to 6
with a definite probability P(x_i) = 1/6 over many trials.


EXAMPLE 17.2.1 


Discrete Random Variable If we toss two dice and record the sum of the
points shown in each trial, then this sum is also a discrete random variable that
takes on the value 2 when both dice show 1 with probability (1/6)²; the value
3 when one die has 1 and the other 2, hence with probability (1/6)² + (1/6)² =
1/18; the value 4 when both dice have 2 or one has 1 and the other 3, hence
with probability (1/6)² + (1/6)² + (1/6)² = 1/12; the value 5 with probability
4(1/6)² = 1/9; the value 6 with probability 5/36; the value 7 with the maximum
probability 6(1/6)² = 1/6; up to the value 12 when both dice show 6 points
with probability (1/6)². This probability distribution is symmetric about 7. This
symmetry is obvious from Fig. 17.3 and becomes visible algebraically when
we write the rising and falling linear parts as

$$P(x) = \frac{x-1}{36} = \frac{6-(7-x)}{36}, \qquad x = 2, 3, \ldots, 7,$$

$$P(x) = \frac{13-x}{36} = \frac{6+(7-x)}{36}, \qquad x = 7, 8, \ldots, 12.$$

■
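The distribution of the sum of two dice can be tabulated directly from the 36 equally likely outcomes and compared with the two linear branches. This is a short sketch with exact fractions; the helper name `closed_form` is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum occurs among the 36 equally likely outcomes.
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
P = {s: Fraction(c, 36) for s, c in counts.items()}

def closed_form(x):
    """Rising branch (x - 1)/36 for x <= 7, falling branch (13 - x)/36 for x >= 7."""
    return Fraction(x - 1, 36) if x <= 7 else Fraction(13 - x, 36)

for s in range(2, 13):
    print(s, P[s], closed_form(s))
```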


SUMMARY 


In summary, the different values x_i that a random variable X assumes denote
and distinguish the events in the sample space of an experiment; each
event occurs by chance with a probability P(X = x_i) = p_i ≥ 0 that is a
function of the random variable X. A random variable X(e_i) = x_i is defined on
the sample space, that is, for the events e_i ∈ S.


Figure 17.3 Probability Distribution P(x) of the Sum of Points when Two Dice
Are Tossed
We define the probability density f(x) of a continuous random variable X
as

$$P(x \le X \le x + dx) = f(x)\,dx, \qquad (17.9)$$

that is, f(x)dx is the probability that X lies in the interval x ≤ X ≤ x + dx. For
f(x) to be a probability density, it has to satisfy f(x) ≥ 0 and ∫ f(x)dx = 1.
The generalization to probability distributions depending on several random
variables is straightforward. Quantum physics abounds in examples.


EXAMPLE 17.2.2 


Continuous Random Variable: Hydrogen Atom Quantum mechanics
gives the probability |ψ|² d³r of finding a 1s electron in a hydrogen atom in
volume [3] d³r, where ψ = N e^{−r/a} is the wave function that is normalized to

$$1 = \int |\psi|^2\,dV = 4\pi N^2 \int_0^\infty e^{-2r/a}\,r^2\,dr = \pi a^3 N^2,$$

dV = r² dr d cos θ dφ being the volume element and a the Bohr radius. The radial
integral is found by repeated integration by parts or by rescaling it to the gamma
function

$$\int_0^\infty e^{-2r/a}\,r^2\,dr = \left(\frac{a}{2}\right)^3 \int_0^\infty e^{-x} x^2\,dx = \frac{a^3\,\Gamma(3)}{8} = \frac{a^3}{4}.$$

Here, all points in space constitute the sample and represent three random
variables, but the probability density |ψ|² in this case depends only on the
radial variable because of the spherical symmetry of the 1s state.

A measure for the size of the hydrogen atom is given by the average radial
distance of the electron from the proton at the center, which is called the
expectation value

$$\langle 1s|r|1s\rangle = \int r\,|\psi|^2\,dV = 4\pi N^2 \int_0^\infty r\,e^{-2r/a}\,r^2\,dr = \frac{3}{2}\,a$$

in quantum mechanics. We shall define this concept for arbitrary probability
distributions later. ■
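The normalization and the expectation value of r can be checked numerically. The sketch below uses the gamma-function form of the radial integrals and, as an independent cross-check, a simple midpoint rule; units with a = 1, the cutoff at 20a, and the function names are assumptions made for illustration.

```python
import math

def radial_integral(power, a):
    """Integral of r**power * exp(-2r/a) from 0 to infinity: Gamma(power+1) * (a/2)**(power+1)."""
    return math.gamma(power + 1) * (a / 2) ** (power + 1)

def midpoint_integral(power, a, r_max_factor=20.0, steps=20000):
    """Midpoint-rule estimate of the same integral, truncated at r_max_factor * a."""
    h = r_max_factor * a / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** power * math.exp(-2.0 * r / a)
    return h * total

a = 1.0  # Bohr radius taken as the unit of length (assumption)
N2 = 1.0 / (4.0 * math.pi * radial_integral(2, a))    # normalization: N**2 = 1/(pi a**3)
mean_r = 4.0 * math.pi * N2 * radial_integral(3, a)   # expectation value of r

print(N2 * math.pi * a**3, mean_r / a)  # 1 and 3/2, up to rounding
```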


• A random variable that takes only discrete values x_1, x_2, ..., x_n with
probabilities p_1, p_2, ..., p_n, respectively, is called a discrete random variable so
that Σ_i p_i = 1. If an "experiment" or trial is performed, some outcome must
occur, with unit probability.

• If the values comprise a continuous range of values a ≤ x ≤ b, then we deal
with a continuous random variable whose probability distribution may or
may not be a continuous function as well.


[3] Note that |ψ|² 4πr² dr gives the probability for the electron to be found between r and r + dr, at
any angle.
When we measure a quantity x n times, obtaining the values x_j, we define
the average value

$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j \qquad (17.10)$$

of the trials, also called the mean or expectation value, where this formula
assumes that every observed value x_j is equally likely and occurs with
probability 1/n. This connection is the key link of experimental data with probability
theory. This observation and practical experience suggest defining the mean
value for a discrete random variable X as

$$\langle X\rangle = \sum_i x_i\,p_i \qquad (17.11)$$

and for a continuous random variable as

$$\langle X\rangle = \int x\,f(x)\,dx. \qquad (17.12)$$

These are linear averages. Other notations in the literature are X̄ or E(X).

The use of the arithmetic mean x̄ of n measurements as the average value
is suggested by simplicity and plain experience, assuming equal probability
for each x_i again. Why do we not consider the geometric mean

$$x_g = (x_1 x_2 \cdots x_n)^{1/n} \qquad (17.13)$$

or the harmonic mean x_h determined by the relation

$$\frac{1}{x_h} = \frac{1}{n}\left(\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}\right) \qquad (17.14)$$

or that value x̃ which minimizes the sum of absolute deviations Σ_i |x_i − x̃|? Here,
the x_i are taken to increase monotonically. When we plot O(x) = Σ_i |x_i − x|
as in Fig. 17.4(a) for an odd number of points, we realize that it has a minimum
at its central value i = n, whereas for an even number of points E(x) =
Σ_i |x_i − x| is flat in its central region, as shown in Fig. 17.4(b). These
properties make these functions unacceptable for determining average values.

Figure 17.4 (a) Σ_i |x_i − x| for an Odd Number of Points; (b) Σ_i |x_i − x| for
an Even Number of Points

Instead, when we minimize the sum of quadratic deviations

$$\sum_{i=1}^{n} (x - x_i)^2 = \text{minimum}, \qquad (17.15)$$

setting the derivative equal to zero yields 2 Σ_i (x − x_i) = 0 or

$$x = \frac{1}{n}\sum_i x_i \equiv \bar{x};$$

that is, the arithmetic mean. It has another important property: If we denote
the deviations by v_i = x_i − x̄, then Σ_i v_i = 0; that is, the sum of positive
deviations equals the sum of negative deviations. This principle of minimizing
the quadratic sum of deviations is called the method of least squares and is
due to C. F. Gauss, among others.
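The different behavior of the two deviation sums can be seen on a small data set. The following sketch scans a grid of trial values x and records where each sum is smallest; the sample values, the grid, and the function names are illustrative assumptions, not data from the text.

```python
def sum_abs(x, data):
    """Sum of absolute deviations of the data from the trial value x."""
    return sum(abs(xi - x) for xi in data)

def sum_sq(x, data):
    """Sum of quadratic deviations of the data from the trial value x."""
    return sum((xi - x) ** 2 for xi in data)

odd = [1, 2, 4, 8, 20]  # odd number of points (illustrative values)
even = [1, 2, 4, 8]     # even number of points

grid = [i / 100 for i in range(0, 2001)]  # trial values 0.00, 0.01, ..., 20.00
best_sq = min(grid, key=lambda x: sum_sq(x, odd))
best_abs = min(grid, key=lambda x: sum_abs(x, odd))

print(best_sq, sum(odd) / len(odd))  # least squares picks the arithmetic mean
print(best_abs)                      # absolute deviations pick the central point
print([sum_abs(x, even) for x in (2, 3, 4)])  # flat central region for an even number
```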

How close a fit of the mean value is to a set of data points depends on the
spread of the individual measurements from this mean. Again, we reject the
average sum of deviations Σ_{i=1}^{n} |x_i − x̄|/n as a measure of the spread because
it selects the central measurement as the best value for no good reason. A
more appropriate definition of the spread is the square root of the average
squared deviation from the mean, or standard deviation

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2},$$

where the square root is motivated by dimensional analysis.


EXAMPLE 17.2.3 


Standard Deviation of Measurements From the measurements x_1 = 7,
x_2 = 9, x_3 = 10, x_4 = 11, x_5 = 13, we extract x̄ = 10 for the mean value
and σ = √[(9 + 1 + 1 + 9)/4] = 2.2361 for the standard deviation or spread,
using the experimental formula [Eq. (17.2)] because the probabilities are not
known. ■
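The numbers of this example are easy to reproduce with Python's standard `statistics` module. As a hedged observation, the quoted value 2.2361 corresponds to dividing the sum of squared deviations by n − 1 = 4 (`statistics.stdev`), whereas dividing by n = 5 (`statistics.pstdev`) gives 2; the sketch prints both so the two conventions can be compared.

```python
import statistics

data = [7, 9, 10, 11, 13]

mean = statistics.mean(data)       # arithmetic mean
sigma_n = statistics.pstdev(data)  # divides the squared deviations by n
sigma_n1 = statistics.stdev(data)  # divides the squared deviations by n - 1

# Shortcut for the 1/n variance: mean of the squares minus square of the mean.
mean_sq = sum(x * x for x in data) / len(data)
var_shortcut = mean_sq - mean ** 2

print(mean, sigma_n, round(sigma_n1, 4), var_shortcut)
```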


There is another interpretation of the variance in terms of the sum of
squares of measurement differences

$$\sum_{i<k} (x_i - x_k)^2 = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} \left(x_i^2 + x_k^2 - 2x_i x_k\right) = n^2\langle x^2\rangle - n^2\langle x\rangle^2 = n^2\sigma^2$$

because by multiplying out the square in the definition of σ², we obtain

$$\sigma^2 = \frac{1}{n}\sum_i \left(x_i - \langle x\rangle\right)^2 = \frac{1}{n}\sum_i x_i^2 - \frac{2\langle x\rangle}{n}\sum_i x_i + \langle x\rangle^2 = \frac{1}{n}\sum_i x_i^2 - \langle x\rangle^2 = \langle x^2\rangle - \langle x\rangle^2. \qquad (17.17)$$

This formula is often used and widely applied for σ².

Now we are ready to generalize the spread in a set of n measurements with
equal probability 1/n to the variance of an arbitrary probability distribution.
For a discrete random variable X with probabilities p_j at X = x_j we define the
variance

$$\sigma^2 = \sum_j \left(x_j - \langle X\rangle\right)^2 p_j, \qquad (17.18)$$

and similarly for a continuous probability distribution

$$\sigma^2 = \int_{-\infty}^{\infty} \left(x - \langle X\rangle\right)^2 f(x)\,dx. \qquad (17.19)$$

These definitions imply the following theorem: If a random variable Y =
aX + b is linearly related to X, then we can immediately derive the mean value
⟨Y⟩ = a⟨X⟩ + b and variance σ²(Y) = a²σ²(X) from these definitions.

We prove this theorem only for a continuous distribution and leave the case
of a discrete random variable as an exercise for the reader. For the infinitesimal
probability we know that f(x)dx = g(y)dy, with y = ax + b, because the linear
transformation has to preserve probability, so

$$\int_{-\infty}^{\infty} y\,g(y)\,dy = \int_{-\infty}^{\infty} (ax + b)\,f(x)\,dx = a\langle X\rangle + b$$

since ∫ f(x)dx = 1. For the variance we similarly obtain

$$\int_{-\infty}^{\infty} \left(y - \langle Y\rangle\right)^2 g(y)\,dy = \int_{-\infty}^{\infty} \left(ax + b - a\langle X\rangle - b\right)^2 f(x)\,dx = a^2\sigma^2(X)$$

after substituting our result for the mean value ⟨Y⟩.
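For a discrete distribution the same theorem can be verified by direct summation. The sketch below takes a fair die for X and the constants a = 3, b = −2; both choices, like the helper names, are illustrative assumptions.

```python
from fractions import Fraction

# Fair die: values 1..6, each with probability 1/6.
dist_x = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(dist):
    """Mean value, Eq. (17.11)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance, Eq. (17.18)."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

a, b = 3, -2  # illustrative constants, not taken from the text
dist_y = {a * x + b: p for x, p in dist_x.items()}  # distribution of Y = aX + b

print(mean(dist_y), a * mean(dist_x) + b)        # <Y> = a<X> + b
print(variance(dist_y), a**2 * variance(dist_x))  # sigma^2(Y) = a^2 sigma^2(X)
```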

Finally, we prove the general Chebyshev inequality

$$P\left(|x - \langle X\rangle| \ge k\sigma\right) \le \frac{1}{k^2}, \qquad (17.20)$$

which demonstrates why the standard deviation serves as a measure of the
spread of an arbitrary probability distribution from its mean value ⟨X⟩ and
shows why experimental or other data are often characterized according to
their spread in numbers of standard deviations. We first show the simpler
inequality

$$P(Y \ge K) \le \frac{\langle Y\rangle}{K}$$

for a continuous random variable Y whose values y ≥ 0. (The proof for a
discrete random variable follows along similar lines.) This inequality follows
from

$$\langle Y\rangle = \int_0^\infty y\,f(y)\,dy = \int_0^K y\,f(y)\,dy + \int_K^\infty y\,f(y)\,dy \ge \int_K^\infty y\,f(y)\,dy \ge K\int_K^\infty f(y)\,dy = K\,P(Y \ge K).$$

Next, we apply the same method to the positive variance integral

$$\sigma^2 = \int \left(x - \langle X\rangle\right)^2 f(x)\,dx \ge \int_{|x-\langle X\rangle| \ge k\sigma} \left(x - \langle X\rangle\right)^2 f(x)\,dx \ge k^2\sigma^2 \int_{|x-\langle X\rangle| \ge k\sigma} f(x)\,dx = k^2\sigma^2\,P\left(|x - \langle X\rangle| \ge k\sigma\right),$$

decreasing the right-hand side first by omitting the part of the positive integral
with |x − ⟨X⟩| < kσ, then again by replacing (x − ⟨X⟩)² in the remaining integral
by its lowest limit k²σ². This proves the Chebyshev inequality. For k = 3 we
have the conventional three standard deviation estimate

$$P\left(|x - \langle X\rangle| \ge 3\sigma\right) \le \frac{1}{9}. \qquad (17.21)$$
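The Chebyshev bound can be compared with exact tail probabilities for a concrete distribution. This sketch uses the sum of two fair dice; the choice of distribution, the values of k, and the helper name `tail` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Exact distribution of the sum of two fair dice.
P = {}
for d1, d2 in product(range(1, 7), repeat=2):
    P[d1 + d2] = P.get(d1 + d2, Fraction(0)) + Fraction(1, 36)

mean = sum(x * p for x, p in P.items())
var = sum((x - mean) ** 2 * p for x, p in P.items())
sigma = sqrt(float(var))

def tail(k):
    """Exact probability that the sum deviates from the mean by at least k sigma."""
    return sum(p for x, p in P.items() if abs(x - mean) >= k * sigma)

for k in (1.5, 2, 3):
    print(k, float(tail(k)), 1 / k**2)  # exact tail versus Chebyshev bound 1/k^2
```

The bound is far from tight here, which is typical: it holds for every distribution with a finite variance.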


It is straightforward to generalize the mean value to higher moments of
probability distributions relative to the mean value ⟨X⟩:

$$\left\langle (X - \langle X\rangle)^k \right\rangle = \sum_j \left(x_j - \langle X\rangle\right)^k p_j, \quad \text{discrete distribution}, \qquad (17.22)$$

$$\left\langle (X - \langle X\rangle)^k \right\rangle = \int \left(x - \langle X\rangle\right)^k f(x)\,dx, \quad \text{continuous distribution}.$$

The moment generating function

$$\left\langle e^{tX} \right\rangle = \int e^{tx} f(x)\,dx = 1 + t\langle X\rangle + \frac{t^2}{2!}\langle X^2\rangle + \cdots \qquad (17.23)$$

is a weighted sum of the moments of the continuous random variable X
upon substituting the Taylor expansion of the exponential function. Therefore,
⟨X⟩ = d⟨e^{tX}⟩/dt at t = 0. Notice that the moments here are taken about the
origin, not relative to the expectation value; moments relative to the mean, as
in Eq. (17.22), are called central moments. The nth moment
⟨X^n⟩ = d^n⟨e^{tX}⟩/dt^n at t = 0 is given by the nth derivative of the moment
generating function at t = 0. By a change of the parameter t → it, the moment generating
function is related to the characteristic function ⟨e^{itX}⟩, which is the often
used Fourier transform of the probability density f(x).

Moreover, mean values, moments, and variance can be defined similarly
for probability distributions that depend on several random variables. For
simplicity, let us restrict our attention to two continuous random variables
X, Y and list the corresponding quantities

$$\langle X\rangle = \iint x\,f(x, y)\,dx\,dy, \qquad \langle Y\rangle = \iint y\,f(x, y)\,dx\,dy, \qquad (17.24)$$

$$\sigma^2(X) = \iint \left(x - \langle X\rangle\right)^2 f(x, y)\,dx\,dy, \qquad \sigma^2(Y) = \iint \left(y - \langle Y\rangle\right)^2 f(x, y)\,dx\,dy. \qquad (17.25)$$

Two random variables are said to be independent if the probability
density f(x, y) factorizes into a product f(x)g(y) of probability distributions
of one random variable each.

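The covariance defined as

$$\mathrm{cov}(X, Y) = \left\langle (X - \langle X\rangle)(Y - \langle Y\rangle) \right\rangle \qquad (17.26)$$

is a measure of how much the random variables X, Y are correlated (or
related): It is zero for independent random variables because

$$\mathrm{cov}(X, Y) = \iint \left(x - \langle X\rangle\right)\left(y - \langle Y\rangle\right) f(x, y)\,dx\,dy = \int \left(x - \langle X\rangle\right) f(x)\,dx \int \left(y - \langle Y\rangle\right) g(y)\,dy = \left(\langle X\rangle - \langle X\rangle\right)\left(\langle Y\rangle - \langle Y\rangle\right) = 0.$$

The normalized covariance cov(X, Y)/[σ(X)σ(Y)] that has values between −1 and +1 is often
called correlation.

In order to demonstrate that the correlation is bounded by

$$-1 \le \frac{\mathrm{cov}(X, Y)}{\sigma(X)\,\sigma(Y)} \le 1$$

we analyze the positive mean value

$$Q = \left\langle \left[u(X - \langle X\rangle) + v(Y - \langle Y\rangle)\right]^2 \right\rangle = u^2\left\langle [X - \langle X\rangle]^2 \right\rangle + 2uv\left\langle [X - \langle X\rangle][Y - \langle Y\rangle] \right\rangle + v^2\left\langle [Y - \langle Y\rangle]^2 \right\rangle = u^2\sigma(X)^2 + 2uv\,\mathrm{cov}(X, Y) + v^2\sigma(Y)^2 \ge 0, \qquad (17.27)$$

where u, v are numbers, not functions. For this quadratic form to be
nonnegative, its discriminant must obey cov(X, Y)² − σ(X)²σ(Y)² ≤ 0, which
proves the desired inequality.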

The usefulness of the correlation as a quantitative measure is emphasized
by the following theorem: P(Y = aX + b) = 1 is valid if, and only if, the
correlation is equal to ±1. This theorem states that a ±100% correlation between
X, Y implies not only some functional relation between both random variables
but also a linear relation between them. We denote by ⟨B|A⟩ the expectation
value of the conditional probability distribution P(B|A).

To prove this strong correlation property, we apply Bayes's decomposition
law [Eq. (17.8)] to the mean value and variance of the random variable Y,
assuming first P(Y = aX + b) = 1, so that P(Y ≠ aX + b) = 0. This yields

$$\langle Y\rangle = P(Y = aX + b)\,\langle Y \mid Y = aX + b\rangle + P(Y \ne aX + b)\,\langle Y \mid Y \ne aX + b\rangle = \langle aX + b\rangle = a\langle X\rangle + b,$$

$$\sigma(Y)^2 = P(Y = aX + b)\left\langle [Y - \langle Y\rangle]^2 \mid Y = aX + b \right\rangle + P(Y \ne aX + b)\left\langle [Y - \langle Y\rangle]^2 \mid Y \ne aX + b \right\rangle = \left\langle [aX + b - \langle Y\rangle]^2 \right\rangle = \left\langle a^2 [X - \langle X\rangle]^2 \right\rangle = a^2\sigma(X)^2,$$

substituting ⟨Y⟩ = a⟨X⟩ + b. Similarly, we obtain cov(X, Y) = aσ(X)². These
results show that the correlation is a/|a| = ±1.

Conversely, we start from cov(X, Y)² = σ(X)²σ(Y)². Hence, the quadratic
form in Eq. (17.27) must be zero for (practically) all x for some (u_0, v_0) ≠ (0, 0):

$$\left\langle \left[u_0(X - \langle X\rangle) + v_0(Y - \langle Y\rangle)\right]^2 \right\rangle = 0.$$

Because the argument of this mean value is nonnegative, this relationship
is satisfied only if P(u_0(X − ⟨X⟩) + v_0(Y − ⟨Y⟩) = 0) = 1, which means that Y
and X are linearly related.

When we integrate out one random variable we are left with the probability
distribution of the other random variable

$$F(x) = \int f(x, y)\,dy, \qquad \text{or} \qquad G(y) = \int f(x, y)\,dx \qquad (17.28)$$

and analogously for discrete probability distributions. When one or more
random variables are integrated out, the remaining probability distribution
is called marginal, motivated by the geometric aspects of projection. It is
straightforward to show that these marginal distributions satisfy all the
requirements of properly normalized probability distributions.

If we are interested in the distribution of the random variable X for a definite
value y = y_0 of the other random variable, then we deal with a conditional
probability distribution P(X = x|Y = y_0). The corresponding continuous
probability density is f(x, y_0).


EXAMPLE 17.2.4 


Repeated Draws of Cards When we draw cards repeatedly, we shuffle the
deck often because we want to make sure these events stay independent. So
we draw the first card at random from a bridge deck containing 52 cards and
then place it back at a random place. Now we repeat the process for a second
card. Then the deck is reshuffled, etc. We now define the random variables

• X = number of so-called honors; that is, 10s, jacks, queens, kings, or aces;

• Y = number of 2s or 3s.

In a single draw the probability of a 10 to ace is a = 5 · 4/52 = 5/13, and
b = 2 · 4/52 = 2/13 for 2 or 3 to be drawn and c = (13 − 5 − 2)/13 = 6/13 for
anything else with a + b + c = 1.

In two drawings X and Y can be x = 0 = y, when no 10 to ace show up nor
2 or 3. This case has probability c². In general, X, Y can be any combination
of 0, 1, or 2 so that 0 ≤ x + y ≤ 2 because we will hold two cards at the
end. The probability function of (X = x, Y = y) is given by the product of
the probabilities of the three possibilities a^x, b^y, c^{2−x−y} times the number of
distributions (or permutations) of two cards over the three cases with
probabilities a, b, c, which is 2!/[x! y! (2 − x − y)!]. This number is the coefficient of
the power a^x b^y c^{2−x−y} in the generalized binomial expansion of all possibilities
in two drawings with probability 1:

$$1 = (a + b + c)^2 = \sum_{0 \le x+y \le 2} \frac{2!}{x!\,y!\,(2-x-y)!}\,a^x b^y c^{2-x-y} = a^2 + b^2 + c^2 + 2(ab + ac + bc). \qquad (17.29)$$

Hence, the probability distribution of our discrete random variables is given
by

$$f(X = x, Y = y) = \frac{2!}{x!\,y!\,(2-x-y)!} \left(\frac{5}{13}\right)^x \left(\frac{2}{13}\right)^y \left(\frac{6}{13}\right)^{2-x-y}, \qquad x, y = 0, 1, 2; \quad 0 \le x + y \le 2, \qquad (17.30)$$

or more explicitly as

$$f(0, 0) = \left(\frac{6}{13}\right)^2, \qquad f(1, 0) = 2\cdot\frac{5}{13}\cdot\frac{6}{13} = \frac{60}{13^2}, \qquad f(0, 1) = 2\cdot\frac{2}{13}\cdot\frac{6}{13} = \frac{24}{13^2},$$

$$f(2, 0) = \left(\frac{5}{13}\right)^2, \qquad f(0, 2) = \left(\frac{2}{13}\right)^2, \qquad f(1, 1) = 2\cdot\frac{5}{13}\cdot\frac{2}{13} = \frac{20}{13^2}.$$

The probability distribution is properly normalized according to Eq. (17.29).
Its expectation values are given by

$$\langle X\rangle = \sum_{0 \le x+y \le 2} x\,f(x, y) = f(1, 0) + f(1, 1) + 2f(2, 0) = \frac{60}{13^2} + \frac{20}{13^2} + 2\left(\frac{5}{13}\right)^2 = \frac{130}{13^2} = \frac{10}{13} = 2a,$$

$$\langle Y\rangle = \sum_{0 \le x+y \le 2} y\,f(x, y) = f(0, 1) + f(1, 1) + 2f(0, 2) = \frac{24}{13^2} + \frac{20}{13^2} + 2\left(\frac{2}{13}\right)^2 = \frac{52}{13^2} = \frac{4}{13} = 2b,$$

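as expected because we are drawing a card two times. The variances are

$$\sigma^2(X) = \sum_{0 \le x+y \le 2} \left(x - \frac{10}{13}\right)^2 f(x, y) = \left(\frac{10}{13}\right)^2 \left[f(0, 0) + f(0, 1) + f(0, 2)\right] + \left(\frac{3}{13}\right)^2 \left[f(1, 0) + f(1, 1)\right] + \left(\frac{16}{13}\right)^2 f(2, 0)$$

$$= \frac{10^2 \cdot 64 + 3^2 \cdot 80 + 16^2 \cdot 5^2}{13^4} = \frac{4^2 \cdot 5 \cdot 169}{13^4} = \frac{80}{13^2},$$

$$\sigma^2(Y) = \sum_{0 \le x+y \le 2} \left(y - \frac{4}{13}\right)^2 f(x, y) = \left(\frac{4}{13}\right)^2 \left[f(0, 0) + f(1, 0) + f(2, 0)\right] + \left(\frac{9}{13}\right)^2 \left[f(0, 1) + f(1, 1)\right] + \left(\frac{22}{13}\right)^2 f(0, 2)$$

$$= \frac{4^2 \cdot 11^2 + 9^2 \cdot 44 + 22^2 \cdot 2^2}{13^4} = \frac{11 \cdot 4 \cdot 169}{13^4} = \frac{44}{13^2}.$$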

It is reasonable that σ²(Y) < σ²(X) because Y takes only two values 2, 3,
whereas X varies over the five honors. The covariance is given by

$$\mathrm{cov}(X, Y) = \sum_{0 \le x+y \le 2} \left(x - \frac{10}{13}\right)\left(y - \frac{4}{13}\right) f(x, y)$$

$$= \frac{10 \cdot 4}{13^2}\cdot\frac{6^2}{13^2} - \frac{10 \cdot 9}{13^2}\cdot\frac{24}{13^2} - \frac{10 \cdot 22}{13^2}\cdot\frac{4}{13^2} - \frac{3 \cdot 4}{13^2}\cdot\frac{60}{13^2} + \frac{3 \cdot 9}{13^2}\cdot\frac{20}{13^2} - \frac{16 \cdot 4}{13^2}\cdot\frac{5^2}{13^2} = -\frac{20 \cdot 169}{13^4} = -\frac{20}{13^2}.$$

Therefore, the correlation of the random variables X, Y is given by

$$\frac{\mathrm{cov}(X, Y)}{\sigma(X)\,\sigma(Y)} = -\frac{20}{\sqrt{80 \cdot 44}} = -\frac{\sqrt{5}}{2\sqrt{11}} = -0.3371,$$

which means that there is a small (negative) correlation between these random
variables because if a card is an honor it cannot be a 2 or 3 and vice versa.
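Finally, let us determine the marginal distribution

$$F(X = x) = \sum_{y=0}^{2} f(x, y) \qquad (17.31)$$

or explicitly

$$F(0) = f(0, 0) + f(0, 1) + f(0, 2) = \left(\frac{6}{13}\right)^2 + \frac{24}{13^2} + \left(\frac{2}{13}\right)^2 = \left(\frac{8}{13}\right)^2,$$

$$F(1) = f(1, 0) + f(1, 1) = \frac{60}{13^2} + \frac{20}{13^2} = \frac{80}{13^2},$$

$$F(2) = f(2, 0) = \left(\frac{5}{13}\right)^2,$$

which is properly normalized because

$$F(0) + F(1) + F(2) = \frac{64 + 80 + 25}{13^2} = \frac{169}{13^2} = 1.$$

Its mean value is given by

$$\langle X\rangle_F = \sum_{x=0}^{2} x\,F(x) = F(1) + 2F(2) = \frac{80 + 2 \cdot 25}{13^2} = \frac{130}{13^2} = \frac{10}{13} = \langle X\rangle,$$

and its variance

$$\sigma_F^2 = \sum_{x=0}^{2} \left(x - \frac{10}{13}\right)^2 F(x) = \frac{10^2 \cdot 8^2 + 3^2 \cdot 80 + 16^2 \cdot 5^2}{13^4} = \frac{80}{13^2} = \sigma^2(X).$$

From the definitions, it follows that these results hold generally. ■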
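All numbers of this example can be recomputed with exact fractions from Eq. (17.30). The following is a sketch of that bookkeeping; the helper names `f` and `expect` and the constant `DRAWS` are illustrative choices.

```python
from fractions import Fraction
from math import factorial, sqrt

a, b, c = Fraction(5, 13), Fraction(2, 13), Fraction(6, 13)
DRAWS = 2

def f(x, y):
    """Joint probability of x honors and y cards that are 2s or 3s, Eq. (17.30)."""
    rest = DRAWS - x - y
    if rest < 0:
        return Fraction(0)
    coeff = factorial(DRAWS) // (factorial(x) * factorial(y) * factorial(rest))
    return coeff * a**x * b**y * c**rest

support = [(x, y) for x in range(DRAWS + 1) for y in range(DRAWS + 1) if x + y <= DRAWS]

def expect(g):
    """Expectation value of g(x, y) over the joint distribution."""
    return sum(g(x, y) * f(x, y) for x, y in support)

total = expect(lambda x, y: 1)
mean_x = expect(lambda x, y: x)
mean_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mean_x) ** 2)
var_y = expect(lambda x, y: (y - mean_y) ** 2)
cov = expect(lambda x, y: (x - mean_x) * (y - mean_y))
corr = float(cov) / sqrt(float(var_x * var_y))

# Marginal distribution of X, Eq. (17.31).
marginal_x = {x: sum(f(x, y) for y in range(DRAWS + 1)) for x in range(DRAWS + 1)}

print(mean_x, mean_y, var_x, var_y, cov, round(corr, 4))
print(marginal_x)
```

Changing `DRAWS` gives the corresponding multinomial distribution for more draws, under the same assumption of independent draws with replacement.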


Finally, we address the transformation of two random variables X, Y into
U(X, Y), V(X, Y). We treat the continuous case, leaving the discrete case as
an exercise. If

$$u = u(x, y), \quad v = v(x, y); \qquad x = x(u, v), \quad y = y(u, v), \qquad (17.32)$$

describe the transformation and its inverse, then the probability stays invariant
and the integral of the density transforms according to the rules of Chapter 2
so that the transformed probability density becomes

$$g(u, v) = f\left(x(u, v), y(u, v)\right) |J|, \qquad (17.33)$$

with the Jacobian

$$J = \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}. \qquad (17.34)$$


EXAMPLE 17.2.5


Sum, Product, and Ratio of Random Variables Let us consider three
examples. First, the sum Z = X + Y, where the transformation may be taken
to be


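$$x = x, \qquad z = x + y, \qquad J = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1,$$

using

$$\frac{\partial x}{\partial x} = 1, \qquad \frac{\partial (z - y)}{\partial z} = 1, \qquad \frac{\partial y}{\partial x} = 0, \qquad \frac{\partial (z - x)}{\partial z} = 1,$$

so that the probability is given by

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f(x, z - x)\,dx\,dz. \qquad (17.35)$$

If the random variables X, Y are independent with densities f_1, f_2, then

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f_1(x)\,f_2(z - x)\,dx\,dz. \qquad (17.36)$$

Second, the product Z = XY taking X, Z as the new variables leads to the
Jacobian

$$J = \begin{vmatrix} 1 & \dfrac{1}{y} \\[2mm] 0 & \dfrac{1}{x} \end{vmatrix} = \frac{1}{x},$$

using

$$\frac{\partial x}{\partial x} = 1, \qquad \frac{\partial (z/y)}{\partial z} = \frac{1}{y}, \qquad \frac{\partial y}{\partial x} = 0, \qquad \frac{\partial (z/x)}{\partial z} = \frac{1}{x},$$

so that the probability is given by

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f\left(x, \frac{z}{x}\right) \frac{dx}{|x|}\,dz. \qquad (17.37)$$

If the random variables X, Y are independent with densities f_1, f_2, then

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f_1(x)\,f_2\left(\frac{z}{x}\right) \frac{dx}{|x|}\,dz. \qquad (17.38)$$

Third, the ratio Z = X/Y taking Y, Z as the new variables has the Jacobian

$$J = \begin{vmatrix} z & y \\ 1 & 0 \end{vmatrix} = -y,$$

using

$$\frac{\partial (yz)}{\partial y} = z, \qquad \frac{\partial (yz)}{\partial z} = y, \qquad \frac{\partial y}{\partial y} = 1, \qquad \frac{\partial y}{\partial z} = 0,$$

so that the probability is given by

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f(yz, y)\,|y|\,dy\,dz. \qquad (17.39)$$

If the random variables X, Y are independent with densities f_1, f_2, then

$$F(Z) = \int_{-\infty}^{Z} \int_{-\infty}^{\infty} f_1(yz)\,f_2(y)\,|y|\,dy\,dz. \qquad (17.40)$$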
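The double integral of Eq. (17.36) can be evaluated numerically for a simple case. The sketch below assumes that X and Y are both uniformly distributed on [0, 1], for which the inner integral gives the triangular density z for 0 ≤ z ≤ 1 and 2 − z for 1 ≤ z ≤ 2; the midpoint rule, the step counts, and the function names are illustrative choices.

```python
def uniform_density(t):
    """Density of a random variable distributed uniformly on [0, 1] (assumed example)."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def sum_density(z, f1, f2, lo=0.0, hi=1.0, steps=1000):
    """Inner integral of Eq. (17.36): integral of f1(x) f2(z - x) dx, midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f1(x) * f2(z - x)
    return h * total

def sum_cdf(Z, f1, f2, z_lo=0.0, steps=200):
    """Outer integral of Eq. (17.36) from z_lo up to Z, midpoint rule."""
    h = (Z - z_lo) / steps
    return h * sum(sum_density(z_lo + (j + 0.5) * h, f1, f2) for j in range(steps))

print(sum_density(0.5, uniform_density, uniform_density))  # close to 0.5
print(sum_density(1.5, uniform_density, uniform_density))  # close to 0.5
print(sum_cdf(1.0, uniform_density, uniform_density))      # close to 0.5
```

The integration limits `lo`, `hi`, and `z_lo` must cover the region where the assumed densities are nonzero; for other densities they would have to be adapted.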


EXERCISES 

17.2.1 Show that adding a constant c to a random variable X changes the
expectation value ⟨X⟩ by that same constant but not the variance. Show
also that multiplying a random variable by a constant multiplies the
mean by that constant and the variance by the square of that constant.
Show that the random variable X − ⟨X⟩ has mean value zero.

17 . 2.2 If (X), (Y) are the average values of two independent random variables 
X, Y, what is the expectation value of the product X ■ Y? 

17 . 2.3 A velocity vj — Xj/tj is measured by recording the distances Xj at the 
corresponding times tj. Show that oc/t is a good approximation for the 
average velocity v provided all the errors \Xj—x\ <$; \x\ and \ tj — t\ <$: \t\ 
are small. 

17 . 2.4 Define the random variable Y in Example 17.2.4 as the number of 4, or 
5, or 6, or 7, or 8, or 9. Then determine the correlation of the X and Y 
random variables. 

17 . 2.5 If X and Y are two independent random variables with different prob¬ 
ability densities and the function f(x, y) has derivatives of any order, 
express {f(X, Y)) in terms of {X) and (Y). Develop similarly the co- 
variance and correlation. 

17 . 2.6 Let f(x, y) be the joint probability density of two random variables 
X, Y. Find the variance o 2 (aX + bY ), where a, b are constants. What 
happens when X, Y are independent? 

17 . 2.7 The probablility that a particle of an ideal gas travels a small distance dx 
between collisions is ~e~ x Fdx, where / is the constant mean free path. 
Verify that / is the average distance between collisions and determine 
the probability of a free path of length l >3/. 

17 . 2.8 Determine the probability density for a particle in simple harmonic 
motion in the interval — A < x < A. 
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Hint. The probability of the particle being between x and x + dx is 
proportional to the time it takes to travel across the interval. 



probability a = 1/6, and anything else has probability b = 5/6 with a + b = 1.
Let the random variable X = x be the number of 6s. In four tosses,
$0 \le x \le 4$. The probability distribution f(X) is given by the product of
the two possibilities, $a^x$ and $b^{4-x}$, times the number of combinations of
four tosses over the two cases with probabilities a, b. This number is the
coefficient of the power $a^x b^{4-x}$ in the binomial expansion of all
possibilities in four tosses with probability 1:

$$ 1 = (a + b)^4 = \sum_{x=0}^{4} \frac{4!}{x!\,(4 - x)!}\, a^x b^{4-x}
= a^4 + b^4 + 4a^3 b + 4ab^3 + 6a^2 b^2. \tag{17.41} $$

Hence, the probability distribution of our discrete random variable is given by

$$ f(X = x) = \frac{4!}{x!\,(4 - x)!}\, a^x b^{4-x}, \qquad 0 \le x \le 4, $$

or more explicitly

$$ f(0) = b^4, \quad f(1) = 4ab^3, \quad f(2) = 6a^2 b^2, \quad f(3) = 4a^3 b, \quad f(4) = a^4. $$

The probability distribution is properly normalized according to Eq. (17.41).
The probability of three 6s in four tosses is $4a^3 b = 4 \cdot 5/6^4 = 5/324$,
which is fairly small. ■
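The four-toss distribution can be tabulated exactly with rational arithmetic.
This is a small sketch for illustration only; the function name `dice_pmf` and
its default arguments are assumptions, not part of the text.

```python
from fractions import Fraction
from math import comb


def dice_pmf(x, tosses=4, a=Fraction(1, 6)):
    """Probability of exactly x sixes in the given number of tosses."""
    b = 1 - a  # probability of anything other than a 6
    return comb(tosses, x) * a**x * b**(tosses - x)


pmf = [dice_pmf(x) for x in range(5)]
print(pmf)        # f(0), ..., f(4) as exact fractions
print(sum(pmf))   # normalization, Eq. (17.41)
print(pmf[3])     # probability of three 6s in four tosses
```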

This case dealt with repeated independent trials, each with two possible
outcomes of constant probability p for a hit and q = 1 − p for a miss, and is
typical of many applications, such as defective products, hits or misses of a
target, and decays of radioactive atoms. The generalization to X = x successes
in n trials is given by the binomial probability distribution

$$ f(X = x) = \frac{n!}{x!\,(n - x)!}\, p^x q^{n-x} = \binom{n}{x} p^x q^{n-x} \tag{17.42} $$

using the binomial coefficients (see Chapter 5). This distribution is normalized
to the probability 1 of all possibilities in n trials, as can be seen from the
binomial expansion

$$ 1 = (p + q)^n = p^n + n p^{n-1} q + \cdots + n p q^{n-1} + q^n. \tag{17.43} $$

Figure 17.5 Binomial Probability Distributions for n = 20 and p = 0.1, 0.3,
and 0.5



Figure 17.5 shows typical histograms. The random variable X takes the values
0, 1, 2, ..., n in discrete steps and can also be viewed as a composition
$\sum_i X_i$ of n independent random variables $X_i$, one for each trial, that
have the value 0 for a miss and 1 for a hit. This observation allows us to
employ the moment generating functions

$$ \langle e^{tX_i} \rangle = P(X_i = 0) + e^t P(X_i = 1) = q + p e^t \tag{17.44} $$

and

$$ \langle e^{tX} \rangle = \prod_i \langle e^{tX_i} \rangle = (p e^t + q)^n, \tag{17.45} $$

from which the mean values and higher moments can be read off upon
differentiating and setting t = 0. Using

$$ \frac{\partial \langle e^{tX} \rangle}{\partial t} = n p e^t (p e^t + q)^{n-1}, \qquad
\langle X \rangle = \left. \frac{\partial \langle e^{tX} \rangle}{\partial t} \right|_{t=0}
= \sum_i x_i f(x_i) = np, $$

$$ \frac{\partial^2 \langle e^{tX} \rangle}{\partial t^2}
= n p e^t (p e^t + q)^{n-1} + n(n - 1) p^2 e^{2t} (p e^t + q)^{n-2}, $$

$$ \langle X^2 \rangle = \left. \frac{\partial^2 \langle e^{tX} \rangle}{\partial t^2} \right|_{t=0}
= \sum_i x_i^2 f(x_i) = np + n(n - 1)p^2, $$

we obtain, with Eq. (17.17),

$$ \sigma^2(X) = \langle X^2 \rangle - \langle X \rangle^2 = np + n(n - 1)p^2 - n^2 p^2
= np(1 - p) = npq. \tag{17.46} $$

Figure 17.5 illustrates these results with peaks at
$\langle X \rangle = np = 2, 6, 10$, which widen with increasing p.
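The results $\langle X\rangle = np$ and $\sigma^2 = npq$ can be confirmed by
summing directly over the distribution of Eq. (17.42). The sketch below does
this for the three cases of Fig. 17.5; the helper names are hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Binomial probability of Eq. (17.42)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_and_variance(n, p):
    """Mean and variance obtained by direct summation over x = 0, ..., n."""
    mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
    second_moment = sum(x * x * binomial_pmf(x, n, p) for x in range(n + 1))
    return mean, second_moment - mean**2


for p in (0.1, 0.3, 0.5):
    mean, variance = mean_and_variance(20, p)
    # Compare with np and npq from Eq. (17.46).
    print(p, mean, 20 * p, variance, 20 * p * (1 - p))
```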

EXERCISES 

17.3.1 Show that the variable X = x number of heads in n coin tosses is a
random variable and determine its probability distribution. Describe
the sample space. What are its mean value, the variance, and standard
deviation? Plot the probability function $f(x) = n!/[x!\,(n - x)!\,2^n]$ for
n = 10, 20, 30 using graphical software.

17.3.2 Plot the binomial probability function for the probabilities p = 1/6,
q = 5/6, and n = 6 throws of a die.

17.3.3 A hardware company knows that mass producing nails includes a small
probability, p = 0.03, of defective nails (without a sharp tip usually).
What is the probability of finding more than 2 defective nails in its
commercial box of 100 nails?

17.3.4 Four cards are drawn from a shuffled bridge deck. What is the
probability that they are all red? That they are all hearts? That they are
honors? Compare the probabilities when the cards are put back at random
places, or not.

17.3.5 Show that for the binomial distribution of Eq. (17.42), the most
probable value of x is np.


17.4 Poisson Distribution 


The Poisson distribution typically occurs in situations involving an event
repeated at a constant rate of probability. The decay of a radioactive sample
is a case in point. If the observation time dt is small enough so that the
emission of two or more particles is negligible, then the probability that one
particle ($^4$He in $\alpha$ decay or an electron in $\beta$ decay) is emitted
is $\mu\,dt$ with constant $\mu$ and $\mu\,dt \ll 1$. We can set up a recursion
relation for the probability $P_n(t)$ of observing n counts during a time
interval t. For n > 0 the probability $P_n(t + dt)$ is composed of two mutually
exclusive events that (i) n particles are emitted in the time t, none in dt,
and (ii) n − 1 particles in time t, one in dt. Therefore,

$$ P_n(t + dt) = P_n(t) P_0(dt) + P_{n-1}(t) P_1(dt). $$

Here, we substitute the probability of observing one particle
$P_1(dt) = \mu\,dt$ and no particle $P_0(dt) = 1 - P_1(dt)$ in time dt. This
yields

$$ P_n(t + dt) = P_n(t)(1 - \mu\,dt) + P_{n-1}(t)\,\mu\,dt $$

so that, after rearranging and dividing by dt, we get

$$ \frac{dP_n(t)}{dt} = \frac{P_n(t + dt) - P_n(t)}{dt} = \mu P_{n-1}(t) - \mu P_n(t). \tag{17.47} $$

For n = 0 this differential recursion relation simplifies because there is no
particle in times t and dt, giving

$$ \frac{dP_0(t)}{dt} = -\mu P_0(t). \tag{17.48} $$

This ordinary differential equation (ODE) integrates to $P_0(t) = e^{-\mu t}$
if the probability that no particle is emitted during a zero time interval,
$P_0(0) = 1$, is used.

Now we go back to Eq. (17.47) for n = 1,

$$ \dot{P}_1 = \mu \left( e^{-\mu t} - P_1 \right), \qquad P_1(0) = 0, \tag{17.49} $$

and solve the homogeneous equation that is the same for $P_1$ as Eq. (17.48).
This yields $P_1(t) = \mu_1 e^{-\mu t}$. Then we solve the inhomogeneous ODE
[Eq. (17.49)] by varying the constant $\mu_1$ to find $\dot{\mu}_1 = \mu$ so
that $P_1(t) = \mu t\, e^{-\mu t}$. The general solution is a Poisson
distribution

$$ P_n(t) = \frac{(\mu t)^n}{n!}\, e^{-\mu t}, \tag{17.50} $$

as may be confirmed by substitution into Eq. (17.47) and verifying the initial
conditions $P_n(0) = 0$, n > 0.
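The substitution check can also be done numerically: the sketch below
approximates $dP_n/dt$ by a central difference and evaluates the residual of
Eq. (17.47) for the solution of Eq. (17.50). The rate, the time, and the step
size h are arbitrary illustrative choices, and the function names are
hypothetical.

```python
from math import exp, factorial


def count_probability(n, t, mu):
    """P_n(t) of Eq. (17.50): probability of n counts in the time interval t."""
    return (mu * t) ** n / factorial(n) * exp(-mu * t)


def recursion_residual(n, t, mu, h=1e-5):
    """Left side minus right side of Eq. (17.47), using a central difference."""
    derivative = (count_probability(n, t + h, mu)
                  - count_probability(n, t - h, mu)) / (2.0 * h)
    previous = count_probability(n - 1, t, mu) if n > 0 else 0.0
    return derivative - mu * (previous - count_probability(n, t, mu))


for n in range(6):
    print(n, recursion_residual(n, t=0.7, mu=2.0))  # residuals close to zero
```

For n = 0 the term $P_{n-1}$ is dropped, which reproduces Eq. (17.48).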

The Poisson distribution is defined with the probabilities

$$ p(n) = \frac{\mu^n}{n!}\, e^{-\mu}, \qquad X = n = 0, 1, 2, \ldots \tag{17.51} $$

and is exhibited in Fig. 17.6. The random variable X is discrete. The
probabilities are properly normalized because
$e^{-\mu} \sum_{n=0}^{\infty} \mu^n/n! = 1$. The mean value and variance,

$$ \langle X \rangle = e^{-\mu} \sum_{n=1}^{\infty} n\, \frac{\mu^n}{n!}
= \mu e^{-\mu} \sum_{n=0}^{\infty} \frac{\mu^n}{n!} = \mu, $$

$$ \sigma^2 = \langle X^2 \rangle - \langle X \rangle^2 = \mu(\mu + 1) - \mu^2 = \mu, \tag{17.52} $$

follow from the characteristic function

$$ \langle e^{itX} \rangle = \sum_{n=0}^{\infty} e^{itn}\, \frac{\mu^n}{n!}\, e^{-\mu}
= e^{-\mu} \sum_{n=0}^{\infty} \frac{(\mu e^{it})^n}{n!} = e^{\mu (e^{it} - 1)} $$

by differentiation and setting t = 0, using Eq. (17.17).

Figure 17.6 Poisson Distribution for $\mu = 3/2$ Compared with Binomial
Distribution for p = 3/20, n = 10

A Poisson distribution becomes a good approximation for the binomial
distribution for a large number n of trials and small probability
$p \sim \mu/n$, $\mu$ a constant.


THEOREM 


In the limit $n \to \infty$ and $p \to 0$ so that the mean value $np \to \mu$
stays finite, the binomial distribution becomes a Poisson distribution.


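To prove this theorem, we apply Stirling's formula
$n! \sim \sqrt{2\pi n}\,(n/e)^n$ for large n to the factorials in Eq. (17.42),
keeping x finite while $n \to \infty$. This yields for $n \to \infty$:

$$ \frac{n!}{(n - x)!} \sim \left( \frac{n}{e} \right)^{x} \left( \frac{n}{n - x} \right)^{n-x}
= \left( \frac{n}{e} \right)^{x} \left( 1 + \frac{x}{n - x} \right)^{n-x}
\sim \left( \frac{n}{e} \right)^{x} e^{x} = n^x, $$

and for $n \to \infty$, $p \to 0$, with $np \to \mu$:

$$ (1 - p)^{n-x} \sim \left( 1 - \frac{\mu}{n} \right)^{n-x} \sim \left( 1 - \frac{\mu}{n} \right)^{n} \to e^{-\mu}. $$

Finally, $p^x n^x \to \mu^x$; thus, altogether

$$ \frac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n-x} \to \frac{\mu^x}{x!}\, e^{-\mu}, \qquad n \to \infty, \tag{17.53} $$

which is a Poisson distribution for the random variable X = x with
$0 \le x < \infty$. This limit theorem is an example of the laws of large
numbers.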
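The approach to the Poisson limit can be watched numerically by holding
$\mu = np$ fixed and increasing n. The sketch below uses $\mu = 3/2$, the value
of Fig. 17.6; the function names and the cutoff `x_max` are illustrative
assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Binomial probability of Eq. (17.42)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """Poisson probability of Eq. (17.51)."""
    return mu**x / factorial(x) * exp(-mu)


def max_difference(n, mu, x_max=10):
    """Largest |binomial - Poisson| over x = 0, ..., min(n, x_max), with p = mu/n."""
    p = mu / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu))
               for x in range(min(n, x_max) + 1))


for n in (10, 100, 10000):
    print(n, max_difference(n, mu=1.5))  # shrinks as n grows
```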


EXERCISES 

17.4.1 Radioactive decays are governed by the Poisson distribution. In a
Rutherford-Geiger experiment the number $n_i$ of emitted $\alpha$ particles
are counted in n = 2608 time intervals of 7.5 seconds each. In Table 17.1,
$n_i$ is the number of time intervals in which i particles were emitted.
Determine the average number $\lambda$ of emitted particles and compare the
$n_i$ of Table 17.1 with $n p_i$ computed from the Poisson distribution with
mean value $\lambda$.

Table 17.1

  i      0    1    2    3    4    5    6    7    8    9   10
  n_i   57  203  383  525  532  408  273  139   45   27   16


17.4.2 Derive the standard deviation of a Poisson distribution of mean
value $\mu$.

17.4.3 The number of $\alpha$ decay particles of a radium sample is counted
per minute for 40 hours. The total number is 5000. How many 1-minute
intervals are there with (a) 2, (b) 5 $\alpha$ particles?

17.4.4 For a radioactive sample, 10 decays are counted on average in 100
seconds. Use the Poisson distribution to estimate the probability of
counting 3 decays in 10 seconds.

17.4.5 $^{238}$U has a half-life of $4.51 \times 10^9$ years. Its decay series
ends with the stable lead isotope $^{206}$Pb. The ratio of the number of
$^{206}$Pb to $^{238}$U atoms in a rock sample is measured as 0.0058. Estimate
the age of the rock assuming that all the lead in the rock is from the initial
decay of the $^{238}$U, which determines the rate of the entire decay process
because the subsequent steps take place far more rapidly.

Hint. The decay constant $\lambda$ in the decay law $N(t) = N e^{-\lambda t}$
is related to the half-life T by $T = \ln 2/\lambda$.

ANS. $3.8 \times 10^7$ years.

17.4.6 The probability of hitting a target in one shot is known to be 20%. If
five shots are fired independently, what is the probability of striking the
target at least once?

17.4.7 A piece of uranium is known to contain the isotopes $^{235}_{92}$U and
$^{238}_{92}$U apart from 0.80 g of $^{206}_{82}$Pb per gram of uranium.
Estimate the age of the piece (and thus Earth) in years.

Hint. Find out first which uranium isotope is the relevant one for the
decay into lead. Use the decay constant from Exercise 17.4.5.


17.5 Gauss’s Normal Distribution 


The bell-shaped Gauss distribution is defined by the probability density

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right),
\qquad -\infty < x < \infty, \tag{17.54} $$

with mean value $\mu$ and variance $\sigma^2$. It is by far the most important
continuous probability distribution and is displayed in Fig. 17.7.

Figure 17.7 Normal Gauss Distribution for Mean Value Zero and Various
Standard Deviations, $h = 1/(\sigma\sqrt{2})$

It is properly normalized because substituting
$y = (x - \mu)/(\sigma\sqrt{2})$, we obtain

$$ \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(x - \mu)^2/2\sigma^2}\,dx
= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-y^2}\,dy
= \frac{2}{\sqrt{\pi}} \int_{0}^{\infty} e^{-y^2}\,dy = 1. $$

Similarly, substituting $y = x - \mu$, we see that

$$ \langle X \rangle - \mu
= \int_{-\infty}^{\infty} \frac{x - \mu}{\sigma \sqrt{2\pi}}\, e^{-(x - \mu)^2/2\sigma^2}\,dx
= \int_{-\infty}^{\infty} \frac{y}{\sigma \sqrt{2\pi}}\, e^{-y^2/2\sigma^2}\,dy = 0, $$

the integrand being odd in y so that the integral over y > 0 cancels that over
y < 0. Similarly, we check that the standard deviation is $\sigma$.

From the normal distribution (by the substitution
$y = (x - \langle X \rangle)/\sigma$)

$$ P(|X - \langle X \rangle| > k\sigma)
= P\left( \frac{|X - \langle X \rangle|}{\sigma} > k \right) = P(|Y| > k)
= \sqrt{\frac{2}{\pi}} \int_{k}^{\infty} e^{-y^2/2}\,dy
= \frac{2}{\sqrt{\pi}} \int_{k/\sqrt{2}}^{\infty} e^{-z^2}\,dz
= \operatorname{erfc} \frac{k}{\sqrt{2}}, $$

we can evaluate the integral for k = 1, 2, 3 and thus extract the following
numerical relations for a normally distributed random variable:

$$ P(|X - \langle X \rangle| \ge \sigma) \sim 0.3173, \quad
P(|X - \langle X \rangle| \ge 2\sigma) \sim 0.0455, \quad
P(|X - \langle X \rangle| \ge 3\sigma) \sim 0.0027, \tag{17.55} $$

of which the last one is interesting to compare with Chebychev's inequality
[Eq. (17.21)], giving $\le 1/9$ for an arbitrary probability distribution
instead of $\sim 0.0027$ for the $3\sigma$ rule of the normal distribution.

THEOREM


Addition Theorem If the random variables X, Y have the same normal 
distributions, that is, the same mean value and variance, then Z = X+Y 
has normal distribution with twice the mean value and twice the variance 
of X and Y. 


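To prove this theorem, we take the Gauss density as

$$ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad \text{with} \quad
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = 1, $$

without loss of generality. Then the probability density of (X, Y) is the
product

$$ f(x, y) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}
= \frac{1}{2\pi}\, e^{-(x^2 + y^2)/2}. $$

Also, Eq. (17.36) gives the density for Z = X + Y as

$$ g(z) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,
\frac{1}{\sqrt{2\pi}}\, e^{-(z - x)^2/2}\,dx. $$

Completing the square in the exponent

$$ 2x^2 - 2xz + z^2 = \left( x\sqrt{2} - \frac{z}{\sqrt{2}} \right)^2 + \frac{z^2}{2}, $$

we obtain

$$ g(z) = \frac{1}{2\pi}\, e^{-z^2/4} \int_{-\infty}^{\infty}
\exp\left( -\left( x - \frac{z}{2} \right)^2 \right) dx. $$

Using the substitution $u = x - z/2$, the integral transforms into

$$ \int_{-\infty}^{\infty} e^{-(x - z/2)^2}\,dx = \int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}, $$

so that the density for Z = X + Y is

$$ g(z) = \frac{1}{2\sqrt{\pi}}\, e^{-z^2/4}, \tag{17.56} $$

which means it has mean value zero and the variance 2, twice that of X and Y.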
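The closed form of Eq. (17.56) can be compared with a direct numerical
evaluation of the convolution integral. This is only a sketch: the midpoint
rule, the integration half-width, and the function names are choices made for
illustration.

```python
from math import exp, pi, sqrt


def standard_normal(x):
    """Gauss density with mean value zero and variance 1."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)


def convolved_density(z, half_width=12.0, steps=4000):
    """Midpoint-rule estimate of the integral of f(x) * f(z - x) over x."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += standard_normal(x) * standard_normal(z - x)
    return total * h


def closed_form(z):
    """Right-hand side of Eq. (17.56)."""
    return exp(-z * z / 4.0) / (2.0 * sqrt(pi))


for z in (0.0, 1.0, 2.5):
    print(z, convolved_density(z), closed_form(z))
```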

In a special limit the discrete Poisson probability distribution is closely
related to the continuous Gauss distribution. This limit theorem is another
example of the laws of large numbers that are often dominated by the
bell-shaped normal distribution.

THEOREM


For large n and mean value $\mu$, the Poisson distribution approaches a
Gauss distribution.


To prove this theorem for $n \to \infty$, we approximate the factorial in the
Poisson probability p(n) of Eq. (17.51) by Stirling's asymptotic formula (see
Chapter 5),

$$ n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^n, \qquad n \to \infty, $$

and choose the deviation $v = n - \mu$ from the mean value as the new variable.
We let the mean value $\mu \to \infty$ and treat $v/\mu$ as small but
$v^2/\mu$ as finite. Substituting $n = \mu + v$ and expanding the logarithm in
a Maclaurin series keeping two terms, we obtain

$$ \ln p(n) = -\mu + n \ln \mu - n \ln n + n - \ln \sqrt{2\pi n} $$
$$ = (\mu + v) \ln \mu - (\mu + v) \ln(\mu + v) + v - \ln \sqrt{2\pi(\mu + v)} $$
$$ = (\mu + v) \ln\left( 1 - \frac{v}{\mu + v} \right) + v - \ln \sqrt{2\pi\mu} $$
$$ = (\mu + v) \left( -\frac{v}{\mu + v} - \frac{v^2}{2(\mu + v)^2} \right) + v - \ln \sqrt{2\pi\mu} $$
$$ \sim -\frac{v^2}{2\mu} - \ln \sqrt{2\pi\mu}, $$

where $\ln\sqrt{2\pi(\mu + v)}$ has been replaced by $\ln\sqrt{2\pi\mu}$
because $|v| \ll \mu$. Exponentiating this result, we find that for large n
and $\mu$

$$ p(n) \to \frac{1}{\sqrt{2\pi\mu}}\, e^{-v^2/2\mu}, \tag{17.57} $$

which is a Gauss distribution of the continuous variable v with mean value 0
and standard deviation $\sigma = \sqrt{\mu}$.
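A quick numerical comparison shows how close Eq. (17.57) already is for a
moderately large mean. In this sketch the mean value 400 is an arbitrary
assumption, the function names are hypothetical, and the Poisson probability
is evaluated through logarithms to avoid overflow in the factorial.

```python
from math import exp, lgamma, log, pi, sqrt


def poisson_pmf(n, mu):
    """Poisson probability of Eq. (17.51), computed as exp(ln p(n))."""
    return exp(n * log(mu) - mu - lgamma(n + 1))


def gauss_limit(n, mu):
    """Gauss approximation of Eq. (17.57) in the deviation v = n - mu."""
    v = n - mu
    return exp(-v * v / (2.0 * mu)) / sqrt(2.0 * pi * mu)


mu = 400.0  # assumed mean value, so the standard deviation is 20
for n in (380, 400, 420):
    print(n, poisson_pmf(n, mu), gauss_limit(n, mu))
```

The agreement is at the percent level within one standard deviation of the
mean and degrades in the tails, where $v/\mu$ is no longer small.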

In a special limit the discrete binomial probability distribution is also 
closely related to the continuous Gauss distribution. This limit theorem is 
another example of the laws of large numbers. 


THEOREM 


In the limit $n \to \infty$, so the mean value $np \to \infty$, the binomial
distribution becomes Gauss's normal distribution. Recall from Section 17.4
that when $np \to \mu < \infty$, the binomial distribution becomes a Poisson
distribution.


Instead of the large number x of successes in n trials, we use the deviation
$v = x - pn$ from the (large) mean value pn as our new continuous random
variable under the condition that $|v| \ll pn$, but $v^2/n$ is finite as
$n \to \infty$. Thus, we replace x by $v + pn$ and $n - x$ by $qn - v$ in the
factorials of Eq. (17.42), $f(x) \to W(v)$ as $n \to \infty$, and then apply
Stirling's formula. This yields

$$ W(v) = \frac{p^x q^{n-x}\, n^{n+1/2}\, e^{-n + x + (n - x)}}
{\sqrt{2\pi}\,(v + pn)^{x+1/2}\,(qn - v)^{n-x+1/2}}. $$

Here, we factor out the dominant powers of n and cancel powers of p and q
to find

$$ W(v) = \frac{1}{\sqrt{2\pi pqn}} \left( 1 + \frac{v}{pn} \right)^{-(v + pn + 1/2)}
\left( 1 - \frac{v}{qn} \right)^{-(qn - v + 1/2)}. $$

In terms of the logarithm, we have

$$ \ln W(v) = \ln \frac{1}{\sqrt{2\pi pqn}}
- (v + pn + 1/2) \ln\left( 1 + \frac{v}{pn} \right)
- (qn - v + 1/2) \ln\left( 1 - \frac{v}{qn} \right) $$
$$ = \ln \frac{1}{\sqrt{2\pi pqn}}
- (v + pn + 1/2) \left( \frac{v}{pn} - \frac{v^2}{2p^2 n^2} + \cdots \right)
- (qn - v + 1/2) \left( -\frac{v}{qn} - \frac{v^2}{2q^2 n^2} + \cdots \right) $$
$$ = \ln \frac{1}{\sqrt{2\pi pqn}}
- \frac{v}{n} \left( \frac{1}{2p} - \frac{1}{2q} \right)
- \frac{v^2}{n} \left( \frac{1}{2p} + \frac{1}{2q} \right) + \cdots, $$

where $v/n \to 0$, $v^2/n$ is finite and

$$ \frac{1}{2p} + \frac{1}{2q} = \frac{p + q}{2pq} = \frac{1}{2pq}. $$

Neglecting higher orders in v/n, such as $v^2/p^2 n^2$, $v^2/q^2 n^2$, we find
the large n limit

$$ W(v) = \frac{1}{\sqrt{2\pi pqn}}\, e^{-v^2/2pqn}, \tag{17.58} $$

which is a Gaussian distribution in the deviations $x - pn$ with mean value 0
and standard deviation $\sigma = \sqrt{npq}$. The large mean value pn (and the
discarded terms) restricts the validity of the theorem to the central part of
the Gaussian bell shape, excluding the tails.
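The quality of Eq. (17.58) near the center of the distribution can be checked
against the exact binomial probabilities. The values n = 1000 and p = 0.3 in
this sketch are arbitrary assumptions, as are the function names; logarithms
of the factorials are used to keep the arithmetic in range.

```python
from math import exp, lgamma, log, pi, sqrt


def binomial_pmf(x, n, p):
    """Binomial probability of Eq. (17.42), computed via log-factorials."""
    log_coefficient = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_coefficient + x * log(p) + (n - x) * log(1.0 - p))


def gauss_limit(x, n, p):
    """Gauss approximation W(v) of Eq. (17.58) with v = x - pn."""
    q = 1.0 - p
    v = x - p * n
    return exp(-v * v / (2.0 * p * q * n)) / sqrt(2.0 * pi * p * q * n)


n, p = 1000, 0.3  # assumed values: mean 300, standard deviation about 14.5
for x in (285, 300, 315):
    print(x, binomial_pmf(x, n, p), gauss_limit(x, n, p))
```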


EXERCISES 

17.5.1 What is the probability for a normally distributed random variable to
differ by more than $4\sigma$ from its mean value? Compare your result with
the corresponding one from Chebychev's inequality. Explain the difference in
your own words.

17.5.2 Let $X_1, X_2, \ldots, X_n$ be independent normal random variables with
the same mean $\bar{x}$ and variance $\sigma^2$. Show that
$\left( \sum_i X_i - n\bar{x} \right)/(\sigma\sqrt{n})$ is normal with mean
zero and variance 1.

17.5.3 An instructor grades a final exam of a large undergraduate class,
obtaining the mean value of points m and the variance $\sigma^2$. Assuming
a normal distribution for the number M of points, he defines a grade
F when $M < m - 3\sigma/2$, D when $m - 3\sigma/2 < M < m - \sigma/2$, C when
$m - \sigma/2 < M < m + \sigma/2$, B when $m + \sigma/2 < M < m + 3\sigma/2$,
and A when $M > m + 3\sigma/2$. What is the percentage of A, F; B, D; C?
Redesign the cutoffs so that there are equal percentages of As and Fs (5%),
25% Bs and Ds, and 40% Cs.

17.5.4 If the random variable X is normal with mean value 29 and standard
deviation 3, what are the distributions of 2X − 1 and 3X + 2?

17.5.5 For a normal distribution of mean value m and variance $\sigma^2$, find
the distance r such that half the area under the bell shape is between
m − r and m + r.


17.6 Statistics 


In statistics, probability theory is applied to the evaluation of data from random 
experiments or to samples to test some hypothesis because the data have 
random fluctuations due to lack of complete control over the experimental 
conditions. Typically, one attempts to estimate the mean value and variance 
of the distributions, from which the samples derive, and generalize properties 
valid for a sample to the rest of the events at a prescribed confidence level. Any 
assumption about an unknown probability distribution is called a statistical 
hypothesis. The concepts of tests and confidence intervals are among the most 
important developments of statistics. 


Error Propagation 


When we measure a quantity x repeatedly obtaining the values $x_j$ at random,
or select a sample for testing, we determine the mean value [Eq. (17.10)] and
the variance

$$ \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j, \qquad
\sigma^2 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})^2 $$

as a measure for the error or spread from the mean value $\bar{x}$. We can
write $x_j = \bar{x} + e_j$, where the error $e_j$ is the deviation from the
mean value, and we know that $\sum_j e_j = 0$. [See the discussion after
Eq. (17.15).]

Now suppose we want to determine a known function f(x) from these
measurements; that is, we have a set $f_j = f(x_j)$ from the measurements of x.
Substituting $x_j = \bar{x} + e_j$ and forming the mean value from

$$ \bar{f} = \frac{1}{n} \sum_j f(x_j) = \frac{1}{n} \sum_j f(\bar{x} + e_j) $$
$$ = f(\bar{x}) + \frac{1}{n} f'(\bar{x}) \sum_j e_j + \frac{1}{2n} f''(\bar{x}) \sum_j e_j^2 + \cdots $$
$$ = f(\bar{x}) + \frac{1}{2} \sigma^2 f''(\bar{x}) + \cdots, \tag{17.59} $$

we obtain the average value $\bar{f}$ as $f(\bar{x})$ in lowest order as
expected. However, in second order there is a correction given by half the
variance with a scale factor $f''(\bar{x})$. It is interesting to compare this
correction of the mean value with the average spread of individual $f_j$ from
the mean value $\bar{f}$, the variance of f. To lowest order, this is given by
the average of the sum of squares of the deviations, in which we approximate
$f_j \approx \bar{f} + f'(\bar{x})\, e_j$, yielding

$$ \sigma^2(f) \equiv \frac{1}{n} \sum_j (f_j - \bar{f})^2
= \bigl( f'(\bar{x}) \bigr)^2\, \frac{1}{n} \sum_j e_j^2
= \bigl( f'(\bar{x}) \bigr)^2 \sigma^2. \tag{17.60} $$

In summary, we may formulate somewhat symbolically

$$ f(\bar{x} \pm \sigma) = f(\bar{x}) \pm f'(\bar{x})\, \sigma $$

as the simplest form of error propagation by a function of one measured
variable.
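To see Eqs. (17.59) and (17.60) at work, the sketch below compares the
propagated mean and spread of $f(x) = x^2$ with the values computed directly
from the individual $f_j$. The five measurements are invented for
illustration, and the function names are hypothetical.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def propagate(xs, f, df, d2f):
    """Mean of f to second order, Eq. (17.59), and spread of f, Eq. (17.60)."""
    x_bar = mean(xs)
    variance = mean([(x - x_bar) ** 2 for x in xs])
    f_mean = f(x_bar) + 0.5 * variance * d2f(x_bar)
    f_spread = abs(df(x_bar)) * sqrt(variance)
    return f_mean, f_spread


xs = [9.8, 10.1, 10.0, 9.9, 10.2]  # hypothetical repeated measurements


def square(x):
    return x * x


estimate = propagate(xs, square, lambda x: 2.0 * x, lambda x: 2.0)

# Direct evaluation from the individual values f_j = f(x_j).
fs = [square(x) for x in xs]
direct_mean = mean(fs)
direct_spread = sqrt(mean([(fj - direct_mean) ** 2 for fj in fs]))
print(estimate)
print((direct_mean, direct_spread))
```

For a quadratic function the second-order mean is exact, while the spread
agrees only to lowest order in the errors.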

For a function $f(x_j, y_k)$ of two measured quantities
$x_j = \bar{x} + u_j$, $y_k = \bar{y} + v_k$, we similarly obtain

$$ \bar{f} = \frac{1}{rs} \sum_{j=1}^{r} \sum_{k=1}^{s} f_{jk}
= \frac{1}{rs} \sum_{j=1}^{r} \sum_{k=1}^{s} f(\bar{x} + u_j, \bar{y} + v_k) $$
$$ = f(\bar{x}, \bar{y}) + \frac{1}{r} f_x \sum_j u_j + \frac{1}{s} f_y \sum_k v_k + \cdots, $$

where $\sum_j u_j = 0 = \sum_k v_k$ so that again
$\bar{f} = f(\bar{x}, \bar{y})$ in lowest order. Here,

$$ f_x = \frac{\partial f}{\partial x}(\bar{x}, \bar{y}), \qquad
f_y = \frac{\partial f}{\partial y}(\bar{x}, \bar{y}) \tag{17.61} $$

denote partial derivatives. The sum of squares of the deviations from the mean
value is given by

$$ \sum_{j=1}^{r} \sum_{k=1}^{s} (f_{jk} - \bar{f})^2 = \sum_{j,k} (u_j f_x + v_k f_y)^2
= s f_x^2 \sum_j u_j^2 + r f_y^2 \sum_k v_k^2 $$

because $\sum_{j,k} u_j v_k = \sum_j u_j \sum_k v_k = 0$. Therefore, the
variance is

$$ \sigma^2(f) = \frac{1}{rs} \sum_{j,k} (f_{jk} - \bar{f})^2
= f_x^2 \sigma_x^2 + f_y^2 \sigma_y^2, \tag{17.62} $$

with $f_x$, $f_y$ from Eq. (17.61), and

$$ \sigma_x^2 = \frac{1}{r} \sum_j u_j^2, \qquad \sigma_y^2 = \frac{1}{s} \sum_k v_k^2 $$

are the variances of the x and y data points. Symbolically, the error
propagation for a function of two measured variables may be summarized as

$$ f(\bar{x} \pm \sigma_x, \bar{y} \pm \sigma_y) = f(\bar{x}, \bar{y})
\pm \sqrt{f_x^2 \sigma_x^2 + f_y^2 \sigma_y^2}. $$

As an application and generalization of the last result, we now calculate
the error of the mean value $\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j$ of a
sample of n individual measurements $x_j$, each with spread $\sigma$. In this
case, the partial derivatives are given by
$f_x = \frac{1}{n} = f_y = \cdots$ and $\sigma_x = \sigma = \sigma_y = \cdots$.
Thus, our last error propagation rule tells us that errors of a sum of
variables add quadratically, so the uncertainty of the arithmetic mean is
given by

$$ \bar{\sigma} = \frac{1}{n} \sqrt{n\sigma^2} = \frac{\sigma}{\sqrt{n}}, \tag{17.63} $$

decreasing with the number of measurements n.

As the number n of measurements increases, we expect the arithmetic
mean $\bar{x}$ to converge to some true value x. Let $\bar{x}$ differ from x
by $\alpha$, let $e_j = x_j - x$ now denote the true deviations, and let
$v_j = x_j - \bar{x}$ be the deviations from the arithmetic mean, so that
$\sum_j v_j = 0$; then

$$ \sum_j (x_j - x)^2 = \sum_j e_j^2 = \sum_j v_j^2 + n\alpha^2. $$

Taking into account the error of the arithmetic mean, we determine the spread
of the individual points about the unknown true mean value to be

$$ \sigma^2 = \frac{1}{n} \sum_j e_j^2 = \frac{1}{n} \sum_j v_j^2 + \alpha^2. $$

According to our earlier discussion leading to Eq. (17.63),
$\alpha^2 = \sigma^2/n$. As a result,

$$ \sigma^2 = \frac{1}{n} \sum_j v_j^2 + \frac{\sigma^2}{n}, $$

from which the standard deviation of a sample in statistics follows:

$$ \sigma = \sqrt{\frac{\sum_j v_j^2}{n - 1}}
= \sqrt{\frac{\sum_j (x_j - \bar{x})^2}{n - 1}}, \tag{17.64} $$

with n − 1 being the number of control measurements of the sample. This
modified mean error includes the expected error in the arithmetic mean.
Because the spread is not well defined when there is no comparison
measurement (i.e., n = 1), the variance is sometimes defined by Eq. (17.64),
replacing the number n of measurements by the number n − 1 of control
measurements in statistics.
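The two conventions differ only in the divisor. As a short illustration, the
sketch below computes both spreads for an invented data set and compares them
with `pstdev` (divisor n) and `stdev` (divisor n − 1) from Python's standard
`statistics` module; the function name `spreads` is hypothetical.

```python
from math import sqrt
import statistics


def spreads(xs):
    """Return the spread with divisor n and with divisor n - 1, Eq. (17.64).

    Requires at least two measurements; for n = 1 the second spread is
    undefined, as noted in the text.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    sum_of_squares = sum((x - x_bar) ** 2 for x in xs)
    return sqrt(sum_of_squares / n), sqrt(sum_of_squares / (n - 1))


xs = [9.8, 10.1, 10.0, 9.9, 10.2]  # hypothetical measurements
print(spreads(xs))
print(statistics.pstdev(xs), statistics.stdev(xs))
```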


Fitting Curves to Data 


Suppose we have a sample of measurements $y_j$ (e.g., a particle moving
freely, that is, with no force) taken at known times $t_j$ (which are taken to
be practically free of errors; that is, the time t is an ordinary independent
variable) that we expect to be linearly related as y = at, our hypothesis. We
want to fit this line to the data.

First, we minimize the sum of deviations $\sum_j (a t_j - y_j)^2$ to determine
the slope parameter a, also called regression coefficient, using the method of
least squares. Differentiating with respect to a, we obtain

$$ 2 \sum_j (a t_j - y_j)\, t_j = 0, $$

from which

$$ a = \frac{\sum_j t_j y_j}{\sum_j t_j^2} \tag{17.65} $$

follows. Note that the numerator is built like a sample covariance, namely the
scalar product of the variables t, y of the sample. As shown in Fig. 17.8, the
measured values $y_j$ do not lie on the line as a rule. They have the spread
(or root mean square deviation from the fitted line)

$$ \sigma = \sqrt{\frac{\sum_j (y_j - a t_j)^2}{n - 1}}. $$

Alternatively, let the $y_j$ values be known (without error) while $t_j$ are
measurements. As suggested by Fig. 17.9, in this case, we need to interchange
the role of t and y and fit the line t = by to the data points. We minimize
$\sum_j (b y_j - t_j)^2$, set the derivative with respect to b equal to zero,
and similarly find the slope parameter

$$ b = \frac{\sum_j t_j y_j}{\sum_j y_j^2}. \tag{17.66} $$

Figure 17.8 Straight Line Fit to Data Points $(t_j, y_j)$ with $t_j$ Known,
$y_j$ Measured

Figure 17.9 Straight Line Fit to Data Points $(t_j, y_j)$ with $y_j$ Known,
$t_j$ Measured

Figure 17.10 (a) Straight Line Fit to Data Points $(t_j, y_j)$. (b) Geometry
of Deviations $u_j$, $v_j$, $d_j$

In case both $t_j$ and $y_j$ have errors (we take t and y to have the same
units), we have to minimize the sum of squares of the deviations of both
variables and fit to a parameterization $t \sin\alpha - y \cos\alpha = 0$,
where t and y occur on an equal footing. As displayed in Fig. 17.10a, this
means geometrically that the line has to be drawn so that the sum of the
squares of the distances $d_j$ of the points $(t_j, y_j)$ from the line
becomes a minimum. (See Fig. 17.10b and Chapter 1.) Here,
$d_j = t_j \sin\alpha - y_j \cos\alpha$, so $\sum_j d_j^2 = $ minimum must be
solved for the angle $\alpha$. Setting the derivative with respect to the
angle equal to zero,

$$ \sum_j (t_j \sin\alpha - y_j \cos\alpha)(t_j \cos\alpha + y_j \sin\alpha) = 0 $$

yields

$$ \sin\alpha \cos\alpha \sum_j (t_j^2 - y_j^2) - (\cos^2\alpha - \sin^2\alpha) \sum_j t_j y_j = 0. $$

Therefore, the angle of the straight line fit is given by

$$ \tan 2\alpha = \frac{2 \sum_j t_j y_j}{\sum_j (t_j^2 - y_j^2)}. \tag{17.67} $$
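Equation (17.67) fixes $2\alpha$ only up to multiples of $\pi$, so a numerical
implementation has to pick the branch that gives the minimum rather than the
maximum of $\sum_j d_j^2$. One way to do this, shown in the sketch below, is
to use the two-argument arctangent; this branch choice and the function names
are assumptions made for illustration.

```python
from math import atan2, cos, sin, tan


def fit_angle(ts, ys):
    """Angle alpha of the line t sin(alpha) - y cos(alpha) = 0, Eq. (17.67).

    atan2 selects the solution of tan(2 alpha) that minimizes the sum of
    squared distances (the other branch, shifted by pi/2, maximizes it).
    """
    numerator = 2.0 * sum(t * y for t, y in zip(ts, ys))
    denominator = sum(t * t - y * y for t, y in zip(ts, ys))
    return 0.5 * atan2(numerator, denominator)


def sum_squared_distances(ts, ys, alpha):
    """Sum of d_j^2 with d_j = t_j sin(alpha) - y_j cos(alpha)."""
    return sum((t * sin(alpha) - y * cos(alpha)) ** 2 for t, y in zip(ts, ys))


ts = [1.0, 2.0, 3.0]
ys = [0.8, 1.5, 3.0]
alpha = fit_angle(ts, ys)
print(alpha, tan(alpha))  # tan(alpha) is the slope dy/dt of the fitted line
```

For points lying exactly on a line y = ct, the fitted angle satisfies
$\tan\alpha = c$.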


SUMMARY 


This least squares fitting applies when the measurement errors are unknown. It
allows assigning at least some kind of error bar to the measured points. Recall
that we did not use errors for the points. Our parameter a (or $\alpha$) is
most likely to reproduce the data in these circumstances. More precisely, the
least squares method is a maximum likelihood estimate of the fitted parameters
when it is reasonable to assume that the errors are independent and normally
distributed with the same deviation for all points. This fairly strong
assumption can be relaxed in "weighted" least squares fits called chi square
fits.⁴ (See also Example 17.6.1.)


The $\chi^2$ Distribution

This distribution is typically applied to fits of a curve y(t, a, ...) with
parameters a, ... to data $t_j$ using the method of least squares involving
the weighted sum of squares of deviations; that is,

$$ \chi^2 = \sum_{j=1}^{N} \left( \frac{y_j - y(t_j, a, \ldots)}{\Delta y_j} \right)^2 $$

is minimized, where N is the number of points and r the number of adjusted
parameters a, .... This quadratic merit function gives more weight to points
with small deviation $\Delta y_j$.

We represent each point by a normally distributed random variable X with
zero mean value and variance $\sigma^2 = 1$, the latter in view of the weights
in the $\chi^2$ function. In a first step, we determine the probability
density for the random variable $Y = X^2$ of a single point that takes only
positive values. Assuming a zero mean value is no loss of generality because
if $\langle X \rangle = m \ne 0$, we would consider the shifted variable
Y = X − m, whose mean value is zero. We show that if X has a Gauss normal
density

$$ f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-x^2/2\sigma^2}, \qquad -\infty < x < \infty, $$

then the probability of the random variable Y is zero if $y \le 0$, and

$$ P(Y < y) = P(X^2 < y) = P(-\sqrt{y} < X < \sqrt{y}) \quad \text{if } y > 0. $$

⁴ For more details, see Chapter 14 of Press et al. in Additional Reading of
Chapter 8.

From the continuous normal distribution
$P(y) = \int_{-\infty}^{y} f(x)\,dx$, we obtain the probability density g(y)
by differentiation:

$$ g(y) = \frac{d}{dy} \left[ P(\sqrt{y}) - P(-\sqrt{y}) \right]
= \frac{1}{2\sqrt{y}} \left( f(\sqrt{y}) + f(-\sqrt{y}) \right)
= \frac{1}{\sigma \sqrt{2\pi y}}\, e^{-y/2\sigma^2}, \qquad y > 0. \tag{17.68} $$

This density, $\sim e^{-y/2\sigma^2}/\sqrt{y}$, corresponds to the integrand of
the Euler integral of the gamma function. Such a probability distribution

$$ g(y) = \frac{y^{p-1}}{\Gamma(p)\,(2\sigma^2)^p}\, e^{-y/2\sigma^2} $$

is called a gamma distribution with parameters p and $\sigma$. Its
characteristic function for our case, p = 1/2, is given by the Fourier
transform

$$ \langle e^{itY} \rangle = \frac{1}{\sigma \sqrt{2\pi}} \int_{0}^{\infty}
e^{-y \left( \frac{1}{2\sigma^2} - it \right)} \frac{dy}{\sqrt{y}}
= \frac{1}{\sigma \sqrt{2\pi} \left( \frac{1}{2\sigma^2} - it \right)^{1/2}}
\int_{0}^{\infty} e^{-x}\, \frac{dx}{\sqrt{x}}
= (1 - 2it\sigma^2)^{-1/2}. $$

Since the $\chi^2$ sample function contains a sum of squares, we need an
addition theorem for the gamma distributions: If the independent random
variables $Y_1$ and $Y_2$ have a gamma distribution with p = 1/2, then
$Y_1 + Y_2$ has a gamma distribution with p = 1. Since $Y_1$ and $Y_2$ are
independent, the product of their densities [Eq. (17.36)] generates the
characteristic function

$$ \langle e^{it(Y_1 + Y_2)} \rangle = \langle e^{itY_1} e^{itY_2} \rangle
= \langle e^{itY_1} \rangle \langle e^{itY_2} \rangle = (1 - 2it\sigma^2)^{-1}. \tag{17.69} $$


Now we come to the second step. We assess the quality of the fit by the random
variable $Y = \sum_{j=1}^{n} X_j^2$, where n = N − r is the number of degrees
of freedom for N data points and r fitted parameters. The independent random
variables $X_j$ are taken to be normally distributed with the (sample)
variance $\sigma^2$. (In our case, r = 1 and $\sigma = 1$.) The $\chi^2$
analysis does not really test the assumptions of normality and independence,
but if these are not approximately valid, there will be many outlier points in
the fit. The addition theorem gives the probability density (Fig. 17.11) for Y,

$$ g_n(y) = \frac{y^{\frac{n}{2} - 1}}{2^{n/2}\, \sigma^n\, \Gamma\!\left( \frac{n}{2} \right)}\,
e^{-y/2\sigma^2}, \qquad y > 0, $$

and $g_n(y) = 0$ if y < 0, which is the $\chi^2$ distribution corresponding to
n degrees of freedom. Its characteristic function is

$$ \langle e^{itY} \rangle = (1 - 2it\sigma^2)^{-n/2}. $$

Figure 17.11 $\chi^2$ Probability Density $g_n(y)$

Differentiating and setting t = 0, we obtain its mean value and variance

$$ \langle Y \rangle = n\sigma^2, \qquad \sigma^2(Y) = 2n\sigma^4. \tag{17.70} $$

Tables give values for the $\chi^2$ probability for n degrees of freedom,

$$ P(\chi^2 \ge y_0) = \frac{1}{2^{n/2}\, \Gamma\!\left( \frac{n}{2} \right)}
\int_{y_0}^{\infty} y^{\frac{n}{2} - 1} e^{-y/2}\,dy $$

for $\sigma = 1$ and $y_0 > 0$. To use Table 17.2 for $\sigma \ne 1$, rescale
$y_0 = v_0 \sigma^2$ so that $P(\chi^2 \ge v_0 \sigma^2)$ corresponds to
$P(\chi^2 \ge v_0)$ of Table 17.2. The following example illustrates the whole
process.

Table 17.2 $\chi^2$ Distribution (a)

  n   v = 0.8   v = 0.7   v = 0.5   v = 0.3   v = 0.2   v = 0.1
  1    0.064     0.148     0.455     1.074     1.642     2.706
  2    0.446     0.713     1.386     2.408     3.219     4.605
  3    1.005     1.424     2.366     3.665     4.642     6.251
  4    1.649     2.195     3.357     4.878     5.989     7.779
  5    2.343     3.000     4.351     6.064     7.289     9.236
  6    3.070     3.828     5.348     7.231     8.558    10.645

(a) Entries are $\chi_v^2$ for the probabilities
$v = P(\chi^2 \ge \chi_v^2) = \frac{1}{2^{n/2}\,\Gamma(n/2)}
\int_{\chi_v^2}^{\infty} e^{-y/2}\, y^{\frac{n}{2} - 1}\,dy$ for $\sigma = 1$.

EXAMPLE 17.6.1


Let us apply the / 2 function to the tit in Fig. 17.8. The measured points (t n y :) ± 
A yj) with errors A yj are 


(1,0.8 ±0.1), (2, 1.5 ± 0.05), (3, 3 ±0.2). 
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For comparison, the maximum likelihood fit [Eq. (17.65)] gives 


1 -0.8 + 2- 1.5 + 3-3 
1 + 4 + 9 


12.8 

~\A 


= 0.914. 


Minimizing instead 


X 


2 



y.i - atj 

A %- 


2 


gives 


or 


In our case, 


3T) = _ 2 y- ~ a W 

da (A yjf 


a = 


^ hyj 
-3 (AJ/p 


E,- 


3 (A2/p 


2 


I 


1-0.8 , 2-1.5 , 3-3 

OTP" OOgr 02? 

I 2 , 2 2 , 3 2 

oTF ' Foe? + 02 s 


1505 

1925 


0.782 


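is dominated by the middle point with the smallest error $\Delta y_2 = 0.05$. The
error propagation formula [Eq. (17.62)] gives us the variance $\sigma_a^2$ of the estimate
of $a$,

$$\sigma_a^2 = \sum_j (\Delta y_j)^2\left(\frac{\partial a}{\partial y_j}\right)^2
            = \frac{1}{\sum_j t_j^2/(\Delta y_j)^2},$$

using

$$\frac{\partial a}{\partial y_j} = \frac{t_j/(\Delta y_j)^2}{\sum_k t_k^2/(\Delta y_k)^2}.$$

For our case, $\sigma_a = 1/\sqrt{1925} = 0.023$; that is, our slope parameter is
$a = 0.782 \pm 0.023$.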
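
The arithmetic of this fit is easy to retrace numerically. The following is a
minimal sketch in Python, not part of the original text; the function names
`slope_unweighted` and `slope_chi2` are illustrative choices, and the data are
the three points quoted above.

```python
import math

# Data of Example 17.6.1: (t_j, y_j, dy_j)
points = [(1.0, 0.8, 0.1), (2.0, 1.5, 0.05), (3.0, 3.0, 0.2)]

def slope_unweighted(pts):
    """Slope of y = a*t with equal weights (the comparison fit in the text)."""
    num = sum(t * y for t, y, _ in pts)
    den = sum(t * t for t, _, _ in pts)
    return num / den

def slope_chi2(pts):
    """Slope minimizing chi^2 = sum(((y - a*t)/dy)^2) and its standard deviation."""
    num = sum(t * y / dy**2 for t, y, dy in pts)
    den = sum(t * t / dy**2 for t, _, dy in pts)
    return num / den, 1.0 / math.sqrt(den)

a_ml = slope_unweighted(points)
a_fit, sigma_a = slope_chi2(points)
print(round(a_ml, 3), round(a_fit, 3), round(sigma_a, 3))  # prints 0.914 0.782 0.023
```

The weights $1/(\Delta y_j)^2$ are 100, 400, and 25, which is why the middle point
pulls the slope from 0.914 down to 0.782.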

To estimate the quality of this fit of $a$, we compute the $\chi^2$ probability that the
two independent (control) points miss the fit by two standard deviations; that
is, on average each point misses by one standard deviation. We apply the $\chi^2$
distribution to the fit involving $N = 3$ data points and $r = 1$ parameter; that is,
for $n = 3 - 1 = 2$ degrees of freedom. From Eq. (17.70), the $\chi^2$ distribution has
a mean value 2 and a variance 4. A rule of thumb is that $\chi^2 \approx n$ for a reasonably
good fit. Then $P(\chi^2 \ge 2) \approx 0.38$ is read off Table 17.2, where we interpolate
linearly between the $n = 2$ entries $P(\chi^2 \ge 1.386) = 0.50$ and $P(\chi^2 \ge 2.408) = 0.30$
as follows:

$$P(\chi^2 \ge 2) = P(\chi^2 \ge 1.386) - \frac{2 - 1.386}{2.408 - 1.386}
   \left[P(\chi^2 \ge 1.386) - P(\chi^2 \ge 2.408)\right]
   = 0.5 - 0.60\cdot 0.2 = 0.38.$$

(For $n = 2$ the tail integral can be done in closed form, $P(\chi^2 \ge y_0) = e^{-y_0/2}$,
which gives $e^{-1} \approx 0.37$ and confirms the interpolation.) Thus, the $\chi^2$ probability
that, on average, each point misses by one standard deviation is approximately
38% and fairly large. ■
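
As a cross-check on table lookups of this kind, the $\chi^2$ tail probability can
be evaluated directly. The sketch below is an illustration, not the book's
method. It assumes $\sigma = 1$ and an even number of degrees of freedom, for which
the tail integral reduces to $e^{-x/2}\sum_{k<n/2}(x/2)^k/k!$. It treats the
entries of Table 17.2 as thresholds on $\chi^2$ itself; the helper names
`chi2_sf_even` and `interpolate` are hypothetical.

```python
import math

def chi2_sf_even(x, n):
    """P(chi^2 >= x) for even n > 0 degrees of freedom and sigma = 1."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this closed form needs an even n > 0")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(n // 2))

def interpolate(x, x_lo, p_lo, x_hi, p_hi):
    """Linear interpolation of a tail probability between two table entries."""
    return p_lo - (x - x_lo) / (x_hi - x_lo) * (p_lo - p_hi)

exact = chi2_sf_even(2.0, 2)                         # exp(-1) for n = 2
table = interpolate(2.0, 1.386, 0.50, 2.408, 0.30)   # n = 2 row of Table 17.2
print(round(exact, 3), round(table, 3))
```

The same function reproduces other even-$n$ entries; for example,
`chi2_sf_even(3.357, 4)` is close to 0.5, in line with the $n = 4$, $v = 0.5$
entry.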


Our next goal is to compute a confidence interval for the slope parameter
of our fit. A confidence interval for an a priori unknown parameter of some
distribution (e.g., $a$ determined by our fit) is an interval that contains $a$,
not with certainty but with a high probability $p$, the confidence level, which
we can choose. Such an interval is computed for a given sample, and the
analysis involves the Student t distribution.


The Student t Distribution 


Because we always compute the arithmetic mean of measured points, we now
consider the sample function

$$\bar X = \frac{1}{n}\sum_{j=1}^{n} X_j,$$

where the random variables $X_j$ are assumed independent with a normal
distribution of the same mean value $m$ and variance $\sigma^2$. The addition theorem for
the Gauss distribution tells us that $X_1 + \cdots + X_n$ has the mean value $nm$ and
variance $n\sigma^2$. Therefore, $(X_1 + \cdots + X_n)/n$ is normal with mean value $m$ and
variance $n\sigma^2/n^2 = \sigma^2/n$. The probability density of the variable $\bar X - m$ is the
Gauss distribution

$$f(\bar x - m) = \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}
   \exp\left(-\frac{n(\bar x - m)^2}{2\sigma^2}\right). \tag{17.71}$$
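
The $\sigma^2/n$ scaling of the variance of the mean can be made concrete with a small
simulation. This is only an illustrative sketch: the values of `m`, `sigma`,
`n`, the number of trials, and the random seed are arbitrary assumptions, not
taken from the text.

```python
import random
import statistics

random.seed(0)                      # fixed seed so the run is repeatable
m, sigma, n, trials = 2.0, 3.0, 10, 20000

# Draw many samples of size n and record each arithmetic mean.
means = [
    statistics.fmean(random.gauss(m, sigma) for _ in range(n))
    for _ in range(trials)
]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.pvariance(means)
print(mean_of_means, var_of_means, sigma**2 / n)   # last value is sigma^2/n = 0.9
```

Up to statistical fluctuations, the mean of the sample means stays at $m$ while
their variance shrinks to $\sigma^2/n$.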


The key problem solved by the Student t distribution is to provide estimates
for the mean value $m$, when $\sigma$ is not known, in terms of a sample function
whose distribution is independent of $\sigma$. To this end, we define a rescaled
sample function $t$:

$$t = \frac{\bar X - m}{S}\sqrt{n-1}, \qquad
  S^2 = \frac{1}{n}\sum_{j=1}^{n}(X_j - \bar X)^2. \tag{17.72}$$

It can be shown that $\bar X$ and $S$ are independent random variables. Following
the arguments leading to the $\chi^2$ distribution, the density of the denominator
variable $S$ is given by the gamma distribution

$$d(s) = \frac{n^{(n-1)/2}\, s^{n-2}\, e^{-ns^2/2\sigma^2}}
              {2^{(n-3)/2}\,\Gamma\left(\frac{n-1}{2}\right)\sigma^{n-1}}. \tag{17.73}$$
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The probability for the ratio Z = X/Y of two independent random variables 
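$X$, $Y$, with the normal density $f$ for $X$ and the density $d$ given by Eq. (17.73)
for $Y$, is [Eq. (17.40)]

$$\int_{-\infty}^{z}\int_{-\infty}^{\infty} f(yz)\,d(y)\,|y|\,dy\,dz, \tag{17.74}$$

so that the variable $V = (\bar X - m)/S$ has the density

$$r(v) = \int_0^\infty \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}
   \exp\left(-\frac{n v^2 s^2}{2\sigma^2}\right)
   \frac{n^{(n-1)/2}\, s^{n-2}\, e^{-ns^2/2\sigma^2}}
        {2^{(n-3)/2}\,\Gamma\left(\frac{n-1}{2}\right)\sigma^{n-1}}\; s\,ds
 = \frac{n^{n/2}}{\sigma^n\sqrt{\pi}\,2^{(n-2)/2}\,\Gamma\left(\frac{n-1}{2}\right)}
   \int_0^\infty e^{-ns^2(v^2+1)/2\sigma^2}\, s^{n-1}\,ds.$$

Here, we substitute $z = s^2$ and obtain

$$r(v) = \frac{n^{n/2}}{\sigma^n\sqrt{\pi}\,2^{n/2}\,\Gamma\left(\frac{n-1}{2}\right)}
   \int_0^\infty e^{-nz(v^2+1)/2\sigma^2}\, z^{(n-2)/2}\,dz.$$

Now we substitute $\Gamma(1/2) = \sqrt{\pi}$, define the parameter

$$a = \frac{n(v^2+1)}{2\sigma^2},$$

and transform the integral into $\Gamma(n/2)/a^{n/2}$ to find

$$r(v) = \frac{\Gamma(n/2)}{\Gamma\left(\frac12\right)\Gamma\left(\frac{n-1}{2}\right)(v^2+1)^{n/2}},
   \qquad -\infty < v < \infty.$$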


Finally, we rescale this expression to the variable $t$ in Eq. (17.72) with the
density (Fig. 17.12)

$$g(t) = \frac{\Gamma(n/2)}
   {\sqrt{\pi(n-1)}\,\Gamma\left(\frac{n-1}{2}\right)\left(1 + \frac{t^2}{n-1}\right)^{n/2}},
   \qquad -\infty < t < \infty, \tag{17.75}$$

Figure 17.12 Student t Probability Density for n = 3
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Table 17.3 

Student t Distribution^a

p        n = 1    n = 2    n = 3    n = 4    n = 5
0.8      1.38     1.06     0.98     0.94     0.92
0.9      3.08     1.89     1.64     1.53     1.48
0.95     6.31     2.92     2.36     2.13     2.02
0.975    12.7     4.30     3.18     2.78     2.57
0.99     31.8     6.96     4.54     3.75     3.36
0.999    318.3    22.3     10.2     7.17     5.89

^a Entries are the values $C$ in
$P(C) = K_n\int_{-\infty}^{C}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt = p$; $n$ is the
number of degrees of freedom.


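for the Student t distribution, which manifestly does not depend on $m$ or $\sigma$.
The probability for $t_1 < t < t_2$ is given by the integral

$$P(t_1, t_2) = \frac{\Gamma(n/2)}{\sqrt{\pi(n-1)}\,\Gamma\left(\frac{n-1}{2}\right)}
   \int_{t_1}^{t_2}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}, \tag{17.76}$$

and $P(z) \equiv P(-\infty, z)$ is tabulated. (See Table 17.3.) Also, $P(-\infty, \infty) = 1$ and
$P(-z) = 1 - P(z)$ because the integrand in Eq. (17.76) is even in $t$, so

$$\int_{-\infty}^{-z}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}
  = \int_{z}^{\infty}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}$$

and

$$\int_{z}^{\infty}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}
  = \int_{-\infty}^{\infty}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}
  - \int_{-\infty}^{z}\frac{dt}{\left(1 + \frac{t^2}{n-1}\right)^{n/2}}.$$

Multiplying this by the factor preceding the integral in Eq. (17.76) yields
$P(-z) = 1 - P(z)$. In the following example, we show how to apply the Student
t distribution to our fit of Example 17.6.1.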


EXAMPLE 17.6.2 


Confidence Interval Here, we want to determine a confidence interval for
the slope $a$ in the linear $y = at$ fit of Fig. 17.8. We assume

• first that the sample points $(t_j, y_j)$ are random and independent; and

• second that, for each fixed value $t$, the random variable $Y$ is normal with
mean $y(t) = at$ and variance $\sigma^2$ independent of $t$.

These values $y_j$ are measurements of the random variable $Y$, but we will regard
them as single measurements of the independent random variables $Y_j$ with the
same normal distribution as $Y$ (whose variance we do not know).
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We choose a confidence level, for example, p = 95%. Then the Student 
probability is

$$P(-C, C) = P(C) - P(-C) = p = -1 + 2P(C), \qquad \text{hence} \qquad P(C) = \tfrac12(1 + p),$$

using $P(-C) = 1 - P(C)$, and

$$P(C) = \tfrac12(1 + p) = 0.975 = K_n\int_{-\infty}^{C}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt,$$

where $K_n$ is the normalization factor of the Student t density for $n$ degrees
of freedom, that is, the factor preceding the integral in Eq. (17.76) with $n - 1$
replaced by $n$. Now we determine a solution $C = 4.3$ from Table 17.3 of Student's
t distribution with $n = N - r = 3 - 1 = 2$ the number of degrees of freedom,
noting that $(1 + p)/2$ corresponds to $p$ in Table 17.3.

Then we compute $\Delta = C\sigma_a/\sqrt{N}$ for sample size $N = 3$. The confidence
interval is given by

$$a - \Delta \le a \le a + \Delta, \qquad \text{at } p = 95\% \text{ confidence level},$$

where the $a$ in the two bounds is the fitted value. From the $\chi^2$ analysis of
Example 17.6.1, we use the slope $a = 0.782$ and variance $\sigma_a^2 = 0.023^2$ so that
$\Delta = 4.3\cdot 0.023/\sqrt{3} = 0.057$, and the confidence interval is determined by
$a - \Delta = 0.782 - 0.057 = 0.725$, $a + \Delta = 0.839$, or

$$0.725 \le a \le 0.839 \qquad \text{at 95\% confidence level}.$$

Compared to $\sigma_a$, the uncertainty of $a$ has increased due to the high confidence
level. Table 17.3 shows that a decrease in confidence level, $p$, reduces the
uncertainty interval, and increasing the number of degrees of freedom, $n$,
would also lower the range of uncertainty. ■
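
The table lookup and the interval can be checked with a few lines of Python.
This is a hedged sketch rather than the book's procedure: it integrates the
Student t density for $n$ degrees of freedom with Simpson's rule (the step
count and the names `student_pdf` and `student_cdf` are assumptions made for
illustration) and then forms $\Delta = C\sigma_a/\sqrt{N}$ with the numbers of the example.

```python
import math

def student_pdf(t, n):
    """Student t density for n degrees of freedom."""
    k = math.gamma((n + 1) / 2) / (math.sqrt(math.pi * n) * math.gamma(n / 2))
    return k * (1.0 + t * t / n) ** (-(n + 1) / 2)

def student_cdf(c, n, steps=20000):
    """P(C) = 1/2 + integral of the even density from 0 to C (Simpson's rule)."""
    h = c / steps                      # steps must be even for Simpson's rule
    total = student_pdf(0.0, n) + student_pdf(c, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_pdf(i * h, n)
    return 0.5 + total * h / 3.0

p = 0.95
C = 4.30                               # Table 17.3: n = 2, row (1 + p)/2 = 0.975
a, sigma_a, N = 0.782, 0.023, 3        # values from Example 17.6.1
delta = C * sigma_a / math.sqrt(N)
low, high = a - delta, a + delta
print(round(student_cdf(C, 2), 3), round(low, 3), round(high, 3))
```

Evaluating `student_cdf(4.30, 2)` returns a value close to 0.975, which is the
statement that $C = 4.3$ solves $P(C) = (1 + p)/2$ for two degrees of freedom.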


EXERCISES 

17.6.1 Let $\Delta A$ be the error of a measurement of $A$, etc. Use error propagation
to show that $\left(\frac{\Delta C}{C}\right)^2 = \left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2$ holds for the product $C = AB$
and the ratio $C = A/B$.

17.6.2 Find the mean value and standard deviation of the sample of
measurements $x_1 = 6.0$, $x_2 = 6.5$, $x_3 = 5.9$, $x_4 = 6.2$. If the point $x_5 = 6.1$ is
added to the sample, how does the change affect the mean value and
standard deviation?

17.6.3 (a) Carry out a $\chi^2$ analysis of the fit of case b in Fig. 17.9 assuming the
same errors for the $t_i$, $\Delta t_i = \Delta y_i$, as for the $y_i$ used in the $\chi^2$ analysis
of the fit in Fig. 17.8.

(b) Determine the confidence interval at the 95% confidence level.

17.6.4 If $x_1, x_2, \ldots, x_n$ are a sample of measurements with its mean value given
by the arithmetic mean $\bar x$, and the corresponding random variables
$X_j$ that take the values $x_j$ with the same probability are independent
and have mean value $\mu$ and variance $\sigma^2$, then show that $\langle\bar x\rangle = \mu$ and
$\sigma^2(\bar x) = \sigma^2/n$. If $\bar\sigma^2 = \frac{1}{n}\sum_j (x_j - \bar x)^2$ is the sample variance, show that
$\langle\bar\sigma^2\rangle = \frac{n-1}{n}\sigma^2$.
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Calculus of Variations 


Uses of the Calculus of Variations 

In this chapter, we address a different class of minimum/maximum problems 
where we search for an appropriate function or curve, rather than a value 
of some variable, that makes a given quantity stationary, usually an energy 
or action integral. Because a function is varied, these problems are called 
variational. Variational principles, such as D’Alembert’s or Hamilton’s, have 
been developed in classical mechanics, and Lagrangian techniques occur in 
quantum mechanics and field theory, but they also appear in electrodynamics 
(e.g., Fermat’s principle of the shortest optical path). Before plunging into this 
different branch of mathematical physics, let us summarize some of its uses 
in both physics and mathematics. 

1. In existing physical theories: 

a. Unification of diverse areas of physics using energy and action as key 
concepts. 

b. Convenience in analysis: Lagrange equations, Section 18.2.

c. Elegant treatment of constraints, Section 18.5. 

2. Starting point for new, complex areas of physics and engineering: In general 
relativity the geodesic is taken as the minimum path of a light pulse or the 
free fall path of a particle in curved Riemannian space. Variational principles 
appear in quantum field theory and have been applied extensively in control 
theory. 

3. Mathematical unification: Variational analysis provides a proof of the com¬ 
pleteness of the Sturm-Liouville eigenfunctions (Chapter 9), and establishes 
a lower bound for the eigenvalues. 

4. Calculation techniques (Section 18.6): Calculation of the eigenfunctions and 
eigenvalues of the Sturm-Liouville equation. 



18.1 Concept of Variation


The calculus of variations involves problems in which the quantity to be
minimized (or maximized) appears as a stationary integral, a functional because a
function $y(x)$ needs to be determined. As the simplest case, let

$$J = \int_{x_1}^{x_2} f\left[y(x), \frac{dy(x)}{dx}, x\right] dx, \tag{18.1}$$

where $J$ is the quantity that takes on a stationary value. Under the integral
sign, $f$ is a known function of the indicated variables $y(x)$, $dy(x)/dx$, $x$ but
the dependence of $y$ on $x$ is not yet known; that is, $y(x)$ is unknown. This
means that although the integral is from $x_1$ to $x_2$, the exact path of integration
is not known (Fig. 18.1). We are to choose the path of integration through
points $(x_1, y_1)$ and $(x_2, y_2)$ to minimize $J$. Strictly speaking, we determine
stationary values of $J$: minima, maxima, or saddle points. In most cases of
physical interest the stationary value will be a minimum.

Figure 18.1 A Varied Path [two paths joining the fixed end points $(x_1, y_1)$ and $(x_2, y_2)$]

This problem is considerably more difficult than the corresponding problem
of a function $y(x)$ in differential calculus. Indeed, there may be no solution.
In differential calculus, the minimum is determined by comparing $y(x_0)$ with
$y(x)$, where $x$ ranges over neighboring points. Here, we assume the existence
of an optimum path, that is, an acceptable path for which $J$ is stationary,
and then compare $J$ for our (unknown) optimum path with that obtained from
neighboring paths. In Fig. 18.1 two possible paths are shown. (There are
infinitely many.) The difference between these two for a given $x$ is called the
variation of $y$, $\delta y$, and is conveniently described by introducing a new function
$\eta(x)$ to define the arbitrary deformation of the path and a scale factor $\alpha$ to give
the magnitude of the variation. The function $\eta(x)$ is arbitrary except for two
restrictions. First,

$$\eta(x_1) = \eta(x_2) = 0, \tag{18.2}$$

which means that all varied paths must pass through the fixed end points.
Second, as will be seen soon, $\eta(x)$ must be differentiable; that is, we may not
use

$$\eta(x) = \begin{cases} 1, & x = x_0, \\ 0, & x \ne x_0, \end{cases} \tag{18.3}$$

but we can choose $\eta(x)$ to have a form similar to the functions used to
represent the Dirac delta function (Chapter 1) so that $\eta(x)$ differs from zero only
over an infinitesimal region.¹ Then, with a path described by $\alpha$ and $\eta(x)$,

$$y(x, \alpha) = y(x, 0) + \alpha\eta(x) \tag{18.4}$$

and

$$\delta y = y(x, \alpha) - y(x, 0) = \alpha\eta(x). \tag{18.5}$$

Let us choose $y(x, 0)$ as the unknown path that will minimize or maximize
$J$. Then $y(x, \alpha)$ with $\alpha \ne 0$ describes a neighboring path. In Eq. (18.1), $J$ is
now a function² of our parameter $\alpha$ [and of the $\eta(x)$]:


$$J(\alpha) = \int_{x_1}^{x_2} f[y(x, \alpha), y_x(x, \alpha), x]\,dx, \tag{18.6}$$

with $y_x = \partial y/\partial x$, and our condition for an extreme value is that

$$\left[\frac{\partial J(\alpha)}{\partial\alpha}\right]_{\alpha=0} = 0, \tag{18.7}$$

analogous to the vanishing of the derivative $dy/dx$ in differential calculus.

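Now the $\alpha$-dependence of the integral is contained in $y(x, \alpha)$ and
$y_x(x, \alpha) = (\partial/\partial x)\,y(x, \alpha)$. Therefore,³

$$\frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}
   \left[\frac{\partial f}{\partial y}\frac{\partial y}{\partial\alpha}
       + \frac{\partial f}{\partial y_x}\frac{\partial y_x}{\partial\alpha}\right] dx. \tag{18.8}$$

From Eq. (18.4),

$$\frac{\partial y(x, \alpha)}{\partial\alpha} = \eta(x), \tag{18.9}$$

$$\frac{\partial y_x(x, \alpha)}{\partial\alpha} = \frac{d\eta(x)}{dx}, \tag{18.10}$$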


1 Compare H. Jeffreys and B. S. Jeffreys, Methods of Mathematical Physics, 3rd ed. Cambridge
Univ. Press, Cambridge, UK (1966), Chapter 10, for a more complete discussion of this point.

2 Technically, $J$ is a functional, depending on the functions $y(x, \alpha)$ and $dy/dx = y_x(x, \alpha)$:
$J[y(x, \alpha), y_x(x, \alpha)]$.

3 Note that $y$ and $y_x$ are being treated as independent variables.
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so that Eq. (18.8) becomes 


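$$\frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}
   \left(\frac{\partial f}{\partial y}\,\eta(x)
       + \frac{\partial f}{\partial y_x}\frac{d\eta(x)}{dx}\right) dx. \tag{18.11}$$

Integrating the second term by parts to get $\eta(x)$ as a common and arbitrary
nonvanishing factor, we obtain

$$\int_{x_1}^{x_2}\frac{d\eta(x)}{dx}\frac{\partial f}{\partial y_x}\,dx
  = \eta(x)\frac{\partial f}{\partial y_x}\bigg|_{x_1}^{x_2}
  - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\frac{\partial f}{\partial y_x}\,dx. \tag{18.12}$$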


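The integrated part vanishes by Eq. (18.2) and, setting $\partial J(\alpha)/\partial\alpha = 0$ at $\alpha = 0$,
Eq. (18.11) leads to

$$\int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y}
   - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right]\eta(x)\,dx = 0. \tag{18.13}$$

Occasionally, Eq. (18.13) is multiplied by $\alpha$, which gives

$$\int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}
   - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right)\delta y\,dx = \delta J = 0. \tag{18.14}$$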


Since $\eta(x)$ is arbitrary, we may choose it to have the same sign as the remainder
of the integrand of Eq. (18.13), whenever the latter differs from zero. Hence,
the integrand is always nonnegative. Equation (18.13), our condition for the
existence of a stationary value, can then be satisfied only if⁴

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0. \tag{18.15}$$


This partial differential equation (PDE) is known as the Euler equation and can
be expressed in various other forms. Sometimes, solutions are missed when
they are not twice differentiable as required by Eq. (18.15). It is clear that Eq.
(18.15) must be satisfied for $J$ to take on a stationary value, that is, for Eq.
(18.14) to be satisfied. Equation (18.15) is necessary, but it is by no means
sufficient.⁵ Courant and Robbins illustrate this very well by considering the
distance over a sphere between points on the sphere, A and B (Fig. 18.2). Path 1,
a great circle, is found from Eq. (18.15). However, path 2, the remainder of the
great circle through points A and B, also satisfies the Euler equation. Path 2 is
a maximum but only if we demand that it be a great circle and then only if we
make less than one circuit; that is, path 2 plus $n$ complete revolutions is also a
solution. If the path is not required to be a great circle, any deviation from path
2 will increase the length. This is hardly the property of a local maximum, and
that is why it is important to check the properties of solutions of Eq. (18.15)
to see if they satisfy the physical conditions of the given problem.

Figure 18.2 Stationary Paths over a Sphere

4 It is important to watch the meaning of $\partial/\partial x$ and $d/dx$ closely. For example, if $f = f[y(x), y_x, x]$,

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}
   + \frac{\partial f}{\partial y_x}\frac{d^2y}{dx^2}.$$

The first term on the right gives the explicit $x$-dependence. The second and third terms give the
implicit $x$-dependence via $y$ and $y_x$.

5 For a discussion of sufficiency conditions and the development of the calculus of variations as
a part of mathematics, see G. M. Ewing, Calculus of Variations with Applications [Norton, New
York (1969)]. Sufficiency conditions are also covered by Sagan (see Additional Reading).

EXAMPLE 18.1.1


Straight Line Perhaps the simplest application of the Euler equation is in
the determination of the shortest distance between two points in the Euclidean
$xy$-plane in terms of a straight line. Because the element of distance is

$$ds = [(dx)^2 + (dy)^2]^{1/2} = [1 + y_x^2]^{1/2}\,dx, \tag{18.16}$$

the distance, $J$, may be written as

$$J = \int_{(x_1, y_1)}^{(x_2, y_2)} ds = \int_{x_1}^{x_2} [1 + y_x^2]^{1/2}\,dx. \tag{18.17}$$

Comparison with Eq. (18.1) shows that

$$f(y, y_x, x) = (1 + y_x^2)^{1/2} \tag{18.18}$$

depends only on $y_x = dy/dx$ but not on $y$ or $x$. Substituting into Eq. (18.15)
and using $\partial f/\partial y = 0$, we obtain

$$\frac{d}{dx}\frac{\partial f}{\partial y_x} = 0,$$

which can be integrated to yield

$$\frac{\partial f}{\partial y_x} = \frac{y_x}{(1 + y_x^2)^{1/2}} = C, \tag{18.19}$$

with $C$ a constant. This can only be satisfied by

$$y_x = a, \qquad \text{with } a \text{ a constant}, \tag{18.20}$$

which has solution

$$y = ax + b, \tag{18.21}$$

the familiar equation for a straight line. The integration constants $a$ and $b$ are
now chosen so that the line passes through the two points $(x_1, y_1)$ and $(x_2, y_2)$.
Hence, the Euler equation predicts that the shortest⁶ distance between two
fixed points in Euclidean space is a straight line.

The generalization of this example in curved four-dimensional space-time
leads to the important concept of the geodesic in general relativity and its
differential equation (see Chapter 2). ■
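
The stationarity condition of Eq. (18.7) can be probed numerically for this
example. The sketch below is an illustration under stated assumptions: the end
points are taken as (0, 0) and (1, 1), the deformation is the hypothetical
choice $\eta(x) = x^2(1 - x)$ (it vanishes at both ends), and the integral is
approximated by the midpoint rule.

```python
import math

def path_length(alpha, n=2000):
    """Length of y(x, alpha) = x + alpha * x^2 (1 - x) for 0 <= x <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h                        # midpoint of the i-th strip
        y_x = 1.0 + alpha * (2 * x - 3 * x * x)  # dy/dx = 1 + alpha * eta'(x)
        total += math.sqrt(1.0 + y_x * y_x) * h
    return total

J0 = path_length(0.0)                            # straight line: sqrt(2)
eps = 1e-4
dJ = (path_length(eps) - path_length(-eps)) / (2 * eps)  # dJ/dalpha at alpha = 0
print(J0, dJ, path_length(0.1) - J0, path_length(-0.1) - J0)
```

The first derivative vanishes to within discretization error, while both
deformed paths are longer than the straight line, consistent with a minimum.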


EXAMPLE 18.1.2 


Optical Path Near Event Horizon of a Black Hole Determine the optical 
path in an atmosphere in which the velocity of light increases in proportion to 
the height, v(y) = y/b, with b > 0 some parameter describing the light speed. 
Therefore, v = 0 at y = 0, which simulates the conditions at the surface of a 
black hole, called its event horizon, where the gravitational force is so strong 
that the velocity of light goes to zero, thus even trapping light. 

Because light takes the shortest time, the variational problem takes the
form

$$\Delta t = \int_{t_1}^{t_2} dt = \int_{s_1}^{s_2}\frac{ds}{v}
   = b\int\frac{\sqrt{dx^2 + dy^2}}{y} = \text{minimum},$$

where $v = ds/dt = y/b$ is the velocity of light in this environment, with the
$y$ coordinate being the height. A look at the variational functional suggests
choosing $y$ as the independent variable because $x$ does not appear in the
integrand. We can bring $dy$ outside the radical and change the role of $x$ and $y$
in $J$ of Eq. (18.1) and the resulting Euler equation. With $x = x(y)$, $x' = dx/dy$,
we obtain

$$b\int_{y_1}^{y_2}\frac{\sqrt{x'^2 + 1}}{y}\,dy = \text{minimum}$$

and the Euler equation becomes

$$\frac{\partial f}{\partial x} - \frac{d}{dy}\frac{\partial f}{\partial x'} = 0. \tag{18.22}$$

Since $\partial f/\partial x = 0$, this can be integrated, giving

$$\frac{x'}{y\sqrt{x'^2 + 1}} = C_1 = \text{const}, \qquad \text{or} \qquad x'^2 = C_1^2 y^2 (x'^2 + 1).$$

Separating $dx$ and $dy$ in this first-order ordinary differential equation (ODE),
we find the integral

$$\int^x dx = \int^y \frac{C_1 y\,dy}{\sqrt{1 - C_1^2 y^2}},$$

which yields

$$x + C_2 = -\frac{1}{C_1}\sqrt{1 - C_1^2 y^2}, \qquad \text{or} \qquad (x + C_2)^2 + y^2 = \frac{1}{C_1^2}.$$

This is a circular light path with center on the $x$-axis along the event horizon
(Fig. 18.3). This example may be adapted to a mirage (Fata Morgana) in a
desert with hot air near the ground and cooler air aloft (the index of refraction
changes with height in cool versus hot air), thus changing the velocity law from
$v = y/b$ to $v = v_0 - y/b$. In this case, the circular light path is no longer convex
with center on the $x$-axis but rather becomes concave. ■

Figure 18.3 Circular Optical Path in Medium

6 Technically, we have a stationary value. From the $\alpha^2$ terms it can be identified as a minimum
(Exercise 18.1.5).
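
One way to gain confidence in the circular solution is to substitute it back
into the first integral $x'/(y\sqrt{x'^2 + 1}) = C_1$. The following sketch does this
numerically; the values of `C1` and `C2` are arbitrary illustrative
assumptions, and $x'$ is approximated by a central difference.

```python
import math

C1, C2 = 0.5, -1.0          # illustrative integration constants (assumed values)

def x_of_y(y):
    """Circular path from x + C2 = -(1/C1) * sqrt(1 - C1^2 y^2)."""
    return -C2 - math.sqrt(1.0 - (C1 * y) ** 2) / C1

def first_integral(y, h=1e-6):
    """Evaluate x' / (y * sqrt(x'^2 + 1)) with x' = dx/dy by central difference."""
    xp = (x_of_y(y + h) - x_of_y(y - h)) / (2 * h)
    return xp / (y * math.sqrt(xp * xp + 1.0))

heights = (0.2, 0.8, 1.4, 1.9)          # all below the circle radius 1/C1 = 2
values = [first_integral(y) for y in heights]
print(values)                            # each entry should be close to C1
```

The combination stays at $C_1$ for every height on the arc, as required for a
solution of Eq. (18.22).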


EXAMPLE 18.1.3 


Soap Film As a second illustration (Fig. 18.4), consider two parallel coaxial
wire circles to be connected by a surface of minimum area that is generated
by revolving a curve $y(x)$ about the $x$-axis. Physically, this will minimize the
energy due to surface tension. The curve is required to pass through fixed end
points $(x_1, y_1)$ and $(x_2, y_2)$. The variational problem is to choose the curve $y(x)$
so that the area of the resulting surface will be a minimum.

For the element of area shown in Fig. 18.4,

$$dA = 2\pi y\,ds = 2\pi y\,(dx^2 + dy^2)^{1/2}, \tag{18.23}$$

where we keep $x$ as the independent variable and bring $dx$ outside the radical.
The variational functional is then

$$A = J = \int_{x_1}^{x_2} 2\pi y\,(1 + y_x^2)^{1/2}\,dx, \qquad y_x = \frac{dy}{dx}.$$

Neglecting the $2\pi$, we obtain

$$f(y, y_x, x) = y\,(1 + y_x^2)^{1/2}.$$

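Since $\partial f/\partial x = 0$, instead of choosing $y$ as an independent variable as in
Example 18.1.2, which would be inconvenient here, we apply the following variant
of the Euler Eq. (18.15):

$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0. \tag{18.24}$$

Figure 18.4 Surface of Rotation: Soap Film Problem

To prove this, we first carry out the differentiation explicitly:

$$\frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right)
  = \frac{\partial f}{\partial x} + y_x\frac{\partial f}{\partial y}
  + y_{xx}\frac{\partial f}{\partial y_x} - y_{xx}\frac{\partial f}{\partial y_x}
  - y_x\frac{d}{dx}\frac{\partial f}{\partial y_x}
  = \frac{\partial f}{\partial x} + y_x\frac{\partial f}{\partial y}
  - y_x\frac{d}{dx}\frac{\partial f}{\partial y_x}.$$

Bringing $\partial f/\partial x$ to the left-hand side and changing the overall sign, we obtain

$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right)
  = -y_x\frac{\partial f}{\partial y} + y_x\frac{d}{dx}\frac{\partial f}{\partial y_x},$$

thus verifying the equivalence of the Euler variant [Eq. (18.24)] and Eq. (18.15)
for $y_x \ne 0$. In the last equality, the relation between the Euler
Eqs. (18.24) and (18.15) can now be written as

$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right)
  = -y_x\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right).$$

For $y_x \ne 0$ the equivalence of both Euler equations is manifest. From this form
[Eq. (18.24)], we get, with $\partial f/\partial x = 0$,

$$\frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0,$$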
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which we integrate to yield 


$$f - y_x\frac{\partial f}{\partial y_x} = c_1 = \text{const}.$$

In our case, this is

$$y\,(1 + y_x^2)^{1/2} - \frac{y\,y_x^2}{(1 + y_x^2)^{1/2}} = c_1 \tag{18.25}$$

or

$$\frac{y}{(1 + y_x^2)^{1/2}} = c_1. \tag{18.26}$$

Squaring, we get

$$\frac{y^2}{1 + y_x^2} = c_1^2. \tag{18.27}$$

Since $y_x^2 \ge 0$, Eq. (18.26) implies that $y^2 \ge c_1^2$ so that

$$(y_x)^{-1} = \frac{dx}{dy} = \frac{c_1}{\sqrt{y^2 - c_1^2}}. \tag{18.28}$$

This may be integrated to give

$$x = c_1\cosh^{-1}\frac{y}{c_1} + c_2. \tag{18.29}$$

Solving for $y$, we have

$$y = c_1\cosh\left(\frac{x - c_2}{c_1}\right). \tag{18.30}$$

The constants $c_1$ and $c_2$ are determined by requiring this curve to pass through
the points $(x_1, y_1)$ and $(x_2, y_2)$. Our “minimum” area surface is a special case
of a catenary of revolution or a catenoid.

For an excellent discussion of both the mathematical problems and
experiments with soap films, we refer to Courant and Robbins in Additional
Reading. ■
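
The catenary solution can be checked against the first integral of
Eq. (18.26), and its area can be compared with that of another surface spanning
the same two rings. The sketch below is illustrative only: the constants
`c1 = 1`, `c2 = 0` and the end points $x = \pm 0.5$ are assumed values, and the
comparison surface is a straight cylinder through the same end rings.

```python
import math

c1, c2 = 1.0, 0.0                 # illustrative constants (assumed values)

def y(x):
    """Catenary y = c1 * cosh((x - c2)/c1)."""
    return c1 * math.cosh((x - c2) / c1)

def y_x(x):
    """Slope dy/dx of the catenary."""
    return math.sinh((x - c2) / c1)

def first_integral(x):
    """Left-hand side of Eq. (18.26): y / (1 + y_x^2)^(1/2)."""
    return y(x) / math.sqrt(1.0 + y_x(x) ** 2)

def area(curve, slope, x1, x2, n=4000):
    """Area 2*pi * integral of y * (1 + y_x^2)^(1/2) dx by the midpoint rule."""
    h = (x2 - x1) / n
    s = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        s += curve(x) * math.sqrt(1.0 + slope(x) ** 2) * h
    return 2.0 * math.pi * s

x1, x2 = -0.5, 0.5
catenoid = area(y, y_x, x1, x2)
cylinder = area(lambda x: y(x1), lambda x: 0.0, x1, x2)   # same end rings
print(first_integral(0.3), catenoid, cylinder)
```

For these values the catenoid has the smaller area of the two surfaces, and
$y/(1 + y_x^2)^{1/2}$ stays equal to $c_1$ along the curve.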


Biographical Data 

d’Alembert, Jean Le Rond. d’Alembert, a French physicist and mathe¬ 
matician, was born in 1717 in Paris and died in 1783 in Paris. Illegitimate 
son of a Paris salon hostess and a cavalry officer, he was abandoned at the 
church of St. Jean-le-Rond (whence his name) and raised by a foster family, 
his education being supported by his father. Besides the famous variational 
principle of analytical mechanics named after him, he developed the preces¬ 
sion of the equinoxes in celestial mechanics, fluid dynamics, and vibrating 
strings, introducing the separation of variables. 
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EXERCISES 


18.1.1 Derive Euler's equation by expanding the integrand of

$$J(\alpha) = \int_{x_1}^{x_2} f[y(x, \alpha), y_x(x, \alpha), x]\,dx$$

in powers of $\alpha$ using a Taylor (Maclaurin) expansion with $y$ and $y_x$ as
the two variables (Section 5.6).

Note. The stationary condition is $\partial J(\alpha)/\partial\alpha = 0$, evaluated at $\alpha = 0$.
The terms quadratic in $\alpha$ may be useful in establishing the nature of
the stationary solution (maximum, minimum, or saddle point).

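18.1.2 Find the Euler equation corresponding to Eq. (18.15) if
$f = f(y_{xx}, y_x, y, x)$.

ANS. $\dfrac{d^2}{dx^2}\left(\dfrac{\partial f}{\partial y_{xx}}\right)
- \dfrac{d}{dx}\left(\dfrac{\partial f}{\partial y_x}\right) + \dfrac{\partial f}{\partial y} = 0$,
$\eta(x_1) = \eta(x_2) = 0$, $\eta_x(x_1) = \eta_x(x_2) = 0$.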

18.1.3 The integrand $f(y, y_x, x)$ of Eq. (18.1) has the form
$f(y, y_x, x) = f_1(x, y) + f_2(x, y)\,y_x$.

(a) Show that the Euler equation leads to

$$\frac{\partial f_1}{\partial y} - \frac{\partial f_2}{\partial x} = 0.$$

(b) What does this imply for the dependence of the integral $J$ upon
the choice of path?

18.1.4 Show that the condition that

$$J = \int f(x, y)\,dx$$

has a stationary value

(a) leads to $f(x, y)$ independent of $y$ and

(b) yields no information about any $x$-dependence.

We get no (continuous, differentiable) solution. To be a meaning¬ 
ful variational problem, dependence on y or higher derivatives is 
essential. 

Note. The situation will change when constraints are introduced. 

18.1.5 In Example 18.1.1, expand $J[y(x, \alpha)] - J[y(x, 0)]$ in powers of $\alpha$.
The term linear in $\alpha$ leads to the Euler equation and to the straight-line
solution [Eq. (18.21)]. Investigate the $\alpha^2$ term and show that the
stationary value of $J$, the straight-line distance, is a minimum.

18.1.6 (a) Show that the integral

$$J = \int_{x_1}^{x_2} f(y, y_x, x)\,dx, \qquad \text{with } f = y(x),$$

has no extreme values.

(b) If $f(y, y_x, x) = y^2(x)$, find a discontinuous solution.
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18.1.7 Fermat’s principle of optics states that a light ray will follow the path 
y(x), for which 


$$\int_{(x_1, y_1)}^{(x_2, y_2)} n(y, x)\,ds$$

is a minimum when $n$ is the index of refraction. For $y_2 = y_1 = 1$, $-x_1 = x_2 = 1$,
find the ray path if

(a) $n = e^y$, (b) $n = a(y - y_0)$, $y > y_0$.


18.1.8 A frictionless particle moves from point A on the surface of the earth 
to point B by sliding through a tunnel under the gravitational force. 
Find the differential equation to be satisfied if the transit time is to be 
a minimum. 

Note. Assume the earth to be a nonrotating sphere of uniform density 
and the depth of the tunnel to be small compared to the radius of the 
earth. 


ANS. [Eq. (18.15)] $r_{\varphi\varphi}(r^3 - ra^2) + r_\varphi^2(2a^2 - r^2) + a^2r^2 = 0$,
$r(\varphi = 0) = r_0$, $r_\varphi(\varphi = 0) = 0$, $r(\varphi = \varphi_A) = a$,
$r(\varphi = \varphi_B) = a$. Or:
$r_\varphi^2 = \dfrac{a^2r^2}{r_0^2}\cdot\dfrac{r^2 - r_0^2}{a^2 - r^2}$.

The solution of these equations is a hypocycloid, generated by a circle
of radius $\frac12(a - r_0)$ rolling inside the circle of radius $a$. You might want
to show that the transit time is

$$t = \pi\,\frac{(a^2 - r_0^2)^{1/2}}{(ag)^{1/2}}.$$

For details, see P. W. Cooper, Am. J. Phys. 34, 68 (1966); G. Veneziano 
et al., Am. J. Phys. 34, 701-704 (1966). 

18.1.9 A ray of light follows a straight-line path in a first homogeneous 
medium, is refracted at an interface, and then follows a new straight- 
line path in the second medium. Use Fermat’s principle of optics to 
derive Snell’s law of refraction: 


$$n_1\sin\theta_1 = n_2\sin\theta_2.$$

Hint. Keep the points $(x_1, y_1)$ and $(x_2, y_2)$ fixed and vary $x_0$ to satisfy
Fermat (Fig. 18.5). This is not a Euler equation problem. (The light
path is not differentiable at $x_0$.)

18.1.10 Find the curve of quickest descent from $(0, 0)$ to $(x_0, y_0)$ for a
particle sliding under gravity and without friction. Show that the ratio of
times taken by the particle along a straight line joining the two points
compared to along the curve of quickest descent is $(1 + 4/\pi^2)^{1/2}$.
Hint. Take $y$ to increase downwards. Obtain $y_x^2 = (1 - c^2y)/c^2y$,
where $c$ is an integration constant. Then make the substitution
$y = (\sin^2\varphi/2)/c^2$ to parameterize the cycloid and take
$(x_0, y_0) = (\pi/2c^2, 1/c^2)$.

Figure 18.5 Snell's Law

18 . 1.11 What is the shortest distance between two points on a cylinder? 


18.2 Several Dependent Variables 


Our original variational problem, Eq. (18.1), may be generalized in several
respects. In this section, we consider the integrand, $f$, to be a function of several
dependent variables, $y_1(x), y_2(x), y_3(x), \ldots$, all of which depend on $x$, the
independent variable. In Section 18.3, $f$ again will contain only one unknown
function $y$, but $y$ will be a function of several independent variables (over which
we integrate). In Section 18.4, these two generalizations are combined. Finally,
in Section 18.5 the stationary value is restricted by one or more constraints.

For more than one dependent variable, Eq. (18.1) becomes

$$J = \int_{x_1}^{x_2} f[y_1(x), y_2(x), \ldots, y_n(x), y_{1x}(x), y_{2x}(x), \ldots, y_{nx}(x), x]\,dx. \tag{18.31}$$

As in Section 18.1, we determine the extreme value of $J$ by comparing
neighboring paths. Let

$$y_i(x, \alpha) = y_i(x, 0) + \alpha\eta_i(x), \qquad i = 1, 2, \ldots, n, \tag{18.32}$$

with the $\eta_i$ independent of one another but subject to the restrictions discussed
in Section 18.1. By differentiating Eq. (18.31) with respect to $\alpha$ and setting
$\alpha = 0$, since Eq. (18.6) still applies, we obtain

$$\int_{x_1}^{x_2}\sum_i\left(\frac{\partial f}{\partial y_i}\eta_i
   + \frac{\partial f}{\partial y_{ix}}\eta_{ix}\right) dx = 0, \tag{18.33}$$

the subscript $x$ denoting partial differentiation with respect to $x$; that is,
$y_{ix} = \partial y_i/\partial x$, and so on. Again, each of the terms $(\partial f/\partial y_{ix})\eta_{ix}$ is integrated by parts.
The integrated part vanishes and Eq. (18.33) becomes

$$\int_{x_1}^{x_2}\sum_i\left(\frac{\partial f}{\partial y_i}
   - \frac{d}{dx}\frac{\partial f}{\partial y_{ix}}\right)\eta_i\,dx = 0. \tag{18.34}$$

Since the $\eta_i$ are arbitrary and independent of one another,⁷ each of the terms
in the sum must vanish independently. We have

$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial(\partial y_i/\partial x)} = 0,
   \qquad i = 1, 2, \ldots, n, \tag{18.35}$$

a whole set of Euler equations, each of which must be satisfied for an extreme
value.


EXAMPLE 18.2.1 


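Missing Dependent Variables Consider the variational problem
$\int f(\dot{\mathbf r})\,dt = $ minimum. Here, $\dot{\mathbf r} = d\mathbf r/dt$ and $\mathbf r$ is absent from the integrand.
Therefore, the Euler equations become

$$\frac{d}{dt}\frac{\partial f}{\partial\dot x} = 0, \qquad
  \frac{d}{dt}\frac{\partial f}{\partial\dot y} = 0, \qquad
  \frac{d}{dt}\frac{\partial f}{\partial\dot z} = 0,$$

with $\mathbf r = (x, y, z)$ so that
$f_{\dot{\mathbf r}} = \left(\frac{\partial f}{\partial\dot x}, \frac{\partial f}{\partial\dot y}, \frac{\partial f}{\partial\dot z}\right) = \mathbf c = $ const. Solving these three
equations for the three unknowns $\dot x, \dot y, \dot z$ yields $\dot{\mathbf r} = \mathbf c_1 = $ const. Integrating this
constant velocity gives $\mathbf r = \mathbf c_1 t + \mathbf c_2$. The solutions are straight lines despite
the general nature of the function $f$.

A physical example illustrating this case is the propagation of light in a
crystal, where the velocity of light depends on the (crystal) directions but not
on the location in the crystal because a crystal is an anisotropic homogeneous
medium. The variational problem

$$\int\frac{ds}{v} = \int\frac{\sqrt{\dot{\mathbf r}^2}}{v(\dot{\mathbf r})}\,dt = \text{minimum}$$

has the form of our example. Note that $t$ need not be the time, but it
parameterizes the light path. ■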


7 For example, we could set $\eta_2 = \eta_3 = \eta_4 = \cdots = 0$, eliminating all but one term of the sum, and then
treat $\eta_1$ exactly as in Section 18.1.
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Figure 18.6 


Transversality Condition


Transversality Condition 


So far in our variational problems we have held the end points of the varied
curves fixed (see Fig. 18.1). Now we relax this condition and let one end
point move on a given path for the following reason: We anticipate using the
variational calculus to deal with problems such as Examples 1.2.2 and 1.3.1,
where we seek the shortest distance from an observer to a rocket in flight
or between two rockets in free flight. We shall find and confirm the result of
these examples that the line of shortest distance is perpendicular to the rocket
path, which is called the transversality condition. It may be used to determine the
location of the end point.

Let $\mathbf r = \mathbf r(\tau)$ be the rocket path that, in general, will not be a straight line,
and $\mathbf r_0$ the position of the observer. Let $\mathbf r = \mathbf r_1(t, \tau)$ be a family of curves from
any point $\tau$ on the rocket path to the observer, covering a surface area, so that
$\mathbf r_1(t = 0, \tau) = \mathbf r_0$ (for all $\tau$ values) is the observer's location and
$\mathbf r_1(t = 1, \tau) = \mathbf r(\tau)$ is its intersection with the rocket path (Fig. 18.6). Let $\tau = \tau_0$
denote the curve of shortest distance (the extremal solution) so that
$\mathbf r_1(t = 1, \tau_0) = \mathbf r_s$ is the intersection of the shortest distance curve with the
rocket path. Now we vary the intersection point $\mathbf r_s$ so that $\mathbf r_1(t, \tau_0)$ becomes
$\mathbf r_1(t, \tau_0) + \alpha\mathbf T(t)$ for small enough $\alpha$. Here, $\mathbf T = \partial\mathbf r_1/\partial\tau$, and for $t = 1$,
$\tau = \tau_0$, it is the tangent vector $d\mathbf r/d\tau = \mathbf T_0$ of the rocket path at $\mathbf r_s$. At the
observer's point we have $\mathbf T \to 0$ because this point is fixed. Now we substitute
this curve into the variational integral



$$J(\alpha) = \int_0^1 f(\mathbf r_1 + \alpha\mathbf T,\ \dot{\mathbf r}_1 + \alpha\dot{\mathbf T})\,dt$$


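and set $dJ/d\alpha|_{\alpha=0} = 0$ (see Eq. 18.7), expand to first order in $\alpha$, and integrate by
parts as usual to find

$$\int_0^1 \left(f_{\mathbf r} - \frac{d}{dt} f_{\dot{\mathbf r}}\right)\cdot\mathbf T\,dt + f_{\dot{\mathbf r}}\cdot\mathbf T\,\Big|_{t=0}^{t=1} = 0,$$

where $f_{\dot{\mathbf r}} = (\partial f/\partial\dot x, \partial f/\partial\dot y, \partial f/\partial\dot z)$ and $f_{\mathbf r} = (\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)$. Here, the
integral vanishes because for $\alpha = 0$ we have $\tau = \tau_0$, and the shortest distance
curve satisfies Euler's equation. Thus, the term from the integration by parts
must vanish as well. Since at $t = 0$ (the observer's location) $\mathbf T = 0$, and at
$t = 1$ (the intersection with the rocket path) $\mathbf T = \mathbf T_0$ is the tangent vector of
the rocket path, we end up with the transversality condition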

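$$f_{\dot{\mathbf r}}\big|_{\mathbf r_s}\cdot\mathbf T_0 = 0 = f_{\dot{\mathbf r}}\big|_{\mathbf r_s}\cdot\frac{d\mathbf r}{d\tau}.$$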

When we seek the shortest distance between two curves, the transversality
condition holds at both ends because we may hold one end fixed while we
determine the other end point and then treat the first end point similarly.


EXAMPLE 18.2.2 


Shortest Distance of an Observer from a Rocket Now we are ready to
reconsider Example 1.2.2 using the variational calculus. The observer is at
$\mathbf r_0 = (2, 1, 3)$ and the rocket path is the line

$$\mathbf r(\tau) = \mathbf r_1 + \mathbf v\tau, \qquad \mathbf r_1 = (1, 1, 1), \quad \mathbf v = (1, 2, 3).$$

We want to minimize the distance

$$s = \int \sqrt{\dot x^2 + \dot y^2 + \dot z^2}\,dt = \text{minimum},$$

which we expect to be a straight line $\mathbf r = t\mathbf d + \mathbf r_0$ from the observer at $t = 0$
to the point $\mathbf r_s$ on the path of the rocket. This is confirmed by applying Example
18.2.1 to this problem because the variational integrand depends only on the
slope of the line from the observer to the rocket path. With
$f = \sqrt{\dot x^2 + \dot y^2 + \dot z^2}$, $f_{\dot{\mathbf r}} \propto \dot{\mathbf r} = (\dot x, \dot y, \dot z) = \mathbf d$. The transversality condition
becomes $\mathbf d\cdot\mathbf v = 0$ because $d\mathbf r/d\tau = \mathbf v$. We denote the intersection of the shortest
distance line with the rocket path by $\mathbf r_s = \mathbf v\tau_s + \mathbf r_1$, where $\tau_s$ will be determined
from the transversality condition $(\mathbf r_1 - \mathbf r_0 + \mathbf v\tau_s)\cdot\mathbf v = 0$. Thus,

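$$\tau_s = \frac{(\mathbf r_0 - \mathbf r_1)\cdot\mathbf v}{\mathbf v^2} = \frac{(1, 0, 2)\cdot(1, 2, 3)}{14} = \frac{7}{14} = \frac12,$$

in agreement with Example 1.2.2. This gives

$$\mathbf r_s = \mathbf r_1 + \mathbf v\tau_s = (1, 1, 1) + \tfrac12(1, 2, 3) = \left(\tfrac32, 2, \tfrac52\right)$$

or

$$\mathbf r_s - \mathbf r_0 = \left(-\tfrac12, 1, -\tfrac12\right).$$

The shortest distance is $|\mathbf r_s - \mathbf r_0| = \sqrt{\tfrac12 + 1} = \sqrt{3/2}$.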
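The arithmetic of this example can be checked numerically. The following is a minimal sketch, not part of the original text; the helper names `dot` and `closest_point_on_line` are hypothetical and chosen only for illustration. It imposes the transversality condition $(\mathbf r_s - \mathbf r_0)\cdot\mathbf v = 0$ on the line $\mathbf r_1 + \mathbf v\tau$.

```python
# Sketch: foot of the perpendicular from an observer r0 onto the line r1 + v*tau.
# The function names are illustrative assumptions, not taken from the text.

def dot(a, b):
    """Euclidean scalar product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def closest_point_on_line(r0, r1, v):
    """Return (tau_s, r_s) such that (r_s - r0) . v = 0 with r_s = r1 + v*tau_s."""
    diff = [a - b for a, b in zip(r0, r1)]
    tau_s = dot(diff, v) / dot(v, v)
    r_s = [p + tau_s * q for p, q in zip(r1, v)]
    return tau_s, r_s

r0 = (2.0, 1.0, 3.0)   # observer
r1 = (1.0, 1.0, 1.0)   # point on the rocket path
v = (1.0, 2.0, 3.0)    # rocket velocity (tangent of the path)

tau_s, r_s = closest_point_on_line(r0, r1, v)
d = [a - b for a, b in zip(r_s, r0)]          # shortest-distance vector
distance = dot(d, d) ** 0.5
print(tau_s, r_s, distance)
```

The printed distance can be compared with $\sqrt{3/2}$, and `dot(d, v)` vanishes, which is the transversality condition itself.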
The shortest distance between two rockets in free flight in Example 1.3.1
can be treated similarly. Here, the two transversality conditions become

$$(\mathbf r_{10} - \mathbf r_{20})\cdot\mathbf t_1 = 0 = (\mathbf r_{10} - \mathbf r_{20})\cdot\mathbf t_2,$$

with the tangent vectors

$$\mathbf t_1 = \mathbf r_2 - \mathbf r_1 = (2 - 1, 3 - 1, 4 - 1) = (1, 2, 3),$$
$$\mathbf t_2 = \mathbf r_4 - \mathbf r_3 = (4 - 5, 1 - 2, 2 - 1) = (-1, -1, 1).$$


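Therefore, the direction of the shortest distance $\mathbf r_{10} - \mathbf r_{20}$ is

$$\mathbf n = \frac{\mathbf t_1 \times \mathbf t_2}{|\mathbf t_1 \times \mathbf t_2|} = \frac{1}{\sqrt{42}}(5, -4, 1).$$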


The shortest distance is obtained by projecting the distance between two
points on the rocket paths, one on each path, onto $\mathbf n$; that is,

$$d = (\mathbf r_3 - \mathbf r_1)\cdot\mathbf n = \frac{1}{\sqrt{42}}(5 - 1, 2 - 1, 1 - 1)\cdot(5, -4, 1) = \frac{1}{\sqrt{42}}(4, 1, 0)\cdot(5, -4, 1) = \frac{16}{\sqrt{42}}.$$
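The two-rocket result can be reproduced in a few lines. This is a sketch under the assumption that the two paths are nonparallel straight lines; the function names are hypothetical. The common normal $\mathbf n \propto \mathbf t_1 \times \mathbf t_2$ satisfies both transversality conditions at once.

```python
# Sketch: distance between two skew lines p1 + t1*s and p2 + t2*u.
# Helper names are illustrative assumptions.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_line_distance(p1, t1, p2, t2):
    """Return (distance, unit normal) for two nonparallel lines."""
    n = cross(t1, t2)
    norm = dot(n, n) ** 0.5
    if norm == 0.0:
        raise ValueError("lines are parallel; the common normal is not unique")
    unit_n = tuple(c / norm for c in n)
    separation = tuple(b - a for a, b in zip(p1, p2))
    return abs(dot(separation, unit_n)), unit_n

r1, t1 = (1.0, 1.0, 1.0), (1.0, 2.0, 3.0)     # first rocket: point and tangent
r3, t2 = (5.0, 2.0, 1.0), (-1.0, -1.0, 1.0)   # second rocket: point and tangent

d, n = line_line_distance(r1, t1, r3, t2)
print(d, n)
```

Here `d` can be compared with $16/\sqrt{42}$, and `n` is perpendicular to both tangent vectors.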


Hamilton’s Principle 


The most important application of Eq. (18.31) occurs when the integrand $f$
is taken to be a Lagrangian $L$. The Lagrangian (for nonrelativistic systems;
see Exercise 18.2.5 for a relativistic particle) is defined as the difference of
kinetic and potential energies of a system:


$$L = T - V. \tag{18.36}$$


Using time as an independent variable instead of $x$ and $x_i(t)$ as the dependent
variables,

$$x \to t, \qquad y_i \to x_i(t), \qquad y_{ix} \to \dot x_i(t);$$

$x_i(t)$ is the location and $\dot x_i = dx_i/dt$ the velocity of particle $i$ as a function
of time. The equation $\delta J = 0$ is then a mathematical statement of Hamilton's
principle of classical mechanics,

$$\delta\int_{t_1}^{t_2} L(x_1, x_2, \ldots, x_n, \dot x_1, \dot x_2, \ldots, \dot x_n; t)\,dt = 0. \tag{18.37}$$

In words, Hamilton's principle asserts that the motion of the system from
time $t_1$ to $t_2$ is such that the time integral of the Lagrangian $L$, or action,
has a stationary value. The resulting Euler equations are usually called the
Lagrangian equations of motion,

$$\frac{d}{dt}\frac{\partial L}{\partial\dot x_i} - \frac{\partial L}{\partial x_i} = 0, \tag{18.38}$$
the first term being the kinetic part and the second the generalized force
component. These Lagrangian equations can be derived from Newton's equations
of motion, and Newton's equations can be derived from Lagrange's. The two
sets of equations are equally “fundamental.”

The Lagrangian formulation has advantages over the conventional Newtonian
laws. Whereas Newton's equations are vector equations, Lagrange's
equations involve only scalar quantities. The coordinates $x_1, x_2, \ldots$ need not
be any standard set of coordinates or lengths. They can be selected to match
the conditions of the physical problem. The Lagrange equations are invariant
with respect to the choice of coordinate system. Newton's equations (in
component form) are not manifestly invariant. Exercise 2.5.6 shows what happens
to $\mathbf F = m\mathbf a$ resolved in spherical polar coordinates.

Exploiting the concept of action, we may easily extend the Lagrangian 
formulation from mechanics to diverse fields, such as electrical networks and 
acoustical systems. Extensions to electromagnetism appear in the exercises. 
The result is a unity of otherwise separate areas of physics. In the development 
of new areas the quantization of Lagrangian particle mechanics provided a 
model for the quantization of electromagnetic fields and led to the gauge theory 
of quantum electrodynamics. 

One of the most valuable advantages of the Hamilton principle-Lagrange
equation formulation is the ease in seeing a relation between a symmetry and
a conservation law. For example, let $x_i = \varphi$, an azimuthal angle. If our
Lagrangian is independent of $\varphi$ (i.e., $\varphi$ is an ignorable coordinate), there are two
consequences: (i) an axial (rotational) symmetry from which $\partial L/\partial\varphi = 0$
follows, and (ii) from Eq. (18.38) $\partial L/\partial\dot\varphi$ = constant. Physically, this corresponds
to the conservation or invariance of a component of angular momentum
because the kinetic part of Eq. (18.38) corresponding to an angular coordinate
represents the torque. Similarly, invariance under translation leads to
conservation of linear momentum. Noether's theorem (see Additional Reading in
Chapter 4) generalizes this relation between invariance (symmetry) and
conservation laws.


Biographical Data 

Hamilton, William R. Hamilton, an Irish mathematician and physicist,
was born in 1805 in Dublin, Ireland, and died in 1865 near Dublin. At age
22, while still an undergraduate, he became professor of astronomy at Trinity
College, Dublin. In 1843, he discovered the (noncommuting) quaternions,
which he developed for the rest of his life, in addition to the famous variational
principle of analytical mechanics named after him.


EXAMPLE 18.2.3 


Moving Particle: Cartesian Coordinates Consider Eq. (18.36) for one
particle with kinetic energy

$$T = \tfrac12 m\dot x^2 \tag{18.39}$$

and potential energy $V(x)$, in which, as usual, the force is given by the negative
gradient of the potential,

$$F(x) = -\frac{dV(x)}{dx}. \tag{18.40}$$

From Eq. (18.38),

$$\frac{d}{dt}(m\dot x) - \frac{\partial(T - V)}{\partial x} = m\ddot x - F(x) = 0, \tag{18.41}$$

which is Newton's second law of motion.
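Hamilton's principle for this one-particle case can be illustrated numerically. The sketch below is not from the text: it assumes the specific potential $V(x) = mgx$ (uniform gravity), a path with fixed end points $x(0) = x(T) = 0$, and a simple midpoint discretization of the action; the function name `action` and all parameter values are hypothetical. It compares the action along the Newtonian path with the action along a varied path.

```python
import math

# Sketch: discretized action S = sum (T - V) dt for V(x) = m*g*x (an assumed potential).

def action(path, dt, m=1.0, g=9.81):
    """Midpoint-rule approximation of the time integral of L = T - V."""
    s = 0.0
    for x0, x1 in zip(path, path[1:]):
        velocity = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        s += (0.5 * m * velocity ** 2 - m * g * x_mid) * dt
    return s

g = 9.81
total_time = 1.0
steps = 100
dt = total_time / steps
times = [k * dt for k in range(steps + 1)]

# Newtonian solution with x(0) = x(T) = 0: a parabola with v0 = g*T/2.
v0 = 0.5 * g * total_time
true_path = [v0 * t - 0.5 * g * t ** 2 for t in times]

# Varied path: same end points, deviation eta(t) = 0.1*sin(pi*t/T).
varied_path = [x + 0.1 * math.sin(math.pi * t / total_time)
               for x, t in zip(true_path, times)]

s_true = action(true_path, dt)
s_varied = action(varied_path, dt)
print(s_true, s_varied, s_varied - s_true)
```

For this potential the first-order change in the action vanishes and the difference is the purely quadratic term $\tfrac12 m\int\dot\eta^2\,dt > 0$, so the Newtonian path gives the smaller action.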


EXAMPLE 18.2.4 


Moving Particle: Circular Cylindrical Coordinates Now consider a particle
in the $xy$-plane described in cylindrical coordinates. The kinetic energy
is

$$T = \tfrac12 m(\dot x^2 + \dot y^2) = \tfrac{m}{2}(\dot\rho^2 + \rho^2\dot\varphi^2), \tag{18.42}$$

and we take $V = 0$ for simplicity.

The transformation of $\dot x^2 + \dot y^2$ into circular cylindrical coordinates could
be carried out by taking $x(\rho, \varphi)$ and $y(\rho, \varphi)$ [Eq. (2.7)] and differentiating with
respect to time and squaring. It is much easier to interpret $\dot x^2 + \dot y^2$ as $v^2$ and
write the components of $\mathbf v$ as $\hat{\boldsymbol\rho}(ds_\rho/dt) = \hat{\boldsymbol\rho}\dot\rho$, and so on. (The $ds_\rho$ is an
increment of length, with $\rho$ changing by $d\rho$ and $\varphi$ remaining constant. See Example
2.2.1.)

The Lagrangian equations for $\rho$ and $\varphi$ yield, respectively,

$$\frac{d}{dt}(m\dot\rho) - m\rho\dot\varphi^2 = 0, \qquad \frac{d}{dt}(m\rho^2\dot\varphi) = 0. \tag{18.43}$$

The second equation is a statement of conservation of angular momentum. The
first may be interpreted as radial acceleration$^8$ equated to centrifugal force (a
“generalized force”). ■
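A short numerical experiment makes both statements concrete. The sketch below (an illustration, not the book's method) integrates the two equations of Eq. (18.43) with a classical fourth-order Runge-Kutta step, taking $m = 1$; the function names and the initial data are assumptions. For the chosen start the Cartesian motion is the straight line $x = 1$, $y = t$, so at $t = 1$ one expects $\rho = \sqrt2$ and $\varphi = \pi/4$.

```python
import math

# Sketch: integrate rho'' = rho*phi'^2 and (rho^2 phi')' = 0 for a free particle (m = 1).
# State vector: (rho, phi, rho_dot, phi_dot). Names are illustrative assumptions.

def derivs(state):
    rho, phi, rho_dot, phi_dot = state
    return (rho_dot,
            phi_dot,
            rho * phi_dot ** 2,                 # radial equation of Eq. (18.43)
            -2.0 * rho_dot * phi_dot / rho)     # from d/dt (rho^2 phi_dot) = 0

def rk4_step(state, dt):
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(shifted(k1, dt / 2))
    k3 = derivs(shifted(k2, dt / 2))
    k4 = derivs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end, steps):
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

# Start at (x, y) = (1, 0) with velocity (0, 1): rho = 1, phi = 0, rho_dot = 0, phi_dot = 1.
rho, phi, rho_dot, phi_dot = integrate((1.0, 0.0, 0.0, 1.0), 1.0, 1000)
angular_momentum = rho ** 2 * phi_dot
print(rho, phi, angular_momentum)
```

The quantity `angular_momentum` stays at its initial value 1, while the "centrifugal" term drives $\rho$ outward even though no force acts.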


EXERCISES 

18.2.1 (a) Develop the equations of motion for the Lagrangian

$$L = \tfrac12 m(\dot x^2 + \dot y^2).$$

(b) In what sense do your solutions minimize the integral $\int_{t_1}^{t_2} L\,dt$?
Compare the result for your solution with $x$ = const., $y$ = const.

18.2.2 From Lagrange’s equations of motion [Eq. (18.38)], show that a system 
in stable equilibrium has a minimum potential energy. 

18.2.3 Write out Lagrange’s equations of motion of a particle in spherical polar 
coordinates for a potential V equal to a constant. Identify the terms 
corresponding to (a) centrifugal force and (b) Coriolis force. 


8 Here is a second method of handling Exercise 2.2.8.

18.2.4 The spherical pendulum consists of a mass on a wire of length $l$, free
to move in polar angle $\theta$ and azimuth angle $\varphi$ (Fig. 18.7).

(a) Set up the Lagrangian for this physical system. 

(b) Develop Lagrange’s equations of motion. 

18.2.5 Show that the Lagrangian 



leads to a relativistic form of Newton's second law of motion,

$$\frac{d}{dt}\left(\frac{m_0 v_i}{\sqrt{1 - v^2/c^2}}\right) = F_i,$$

in which the force components are $F_i = -\partial V/\partial x_i$. (Compare with
Chapter 4, Vector Analysis in Minkowski Space-Time.)

18.2.6 The Lagrangian for a particle with charge $q$ in an electromagnetic field
described by scalar potential $\varphi$ and vector potential $\mathbf A$ is

$$L = \tfrac12 mv^2 - q\varphi + q\mathbf A\cdot\mathbf v.$$

Find the equation of motion of the charged particle.

Hint. $(d/dt)A_j = \partial A_j/\partial t + \sum_i (\partial A_j/\partial x_i)\dot x_i$. The dependence of the
force fields $\mathbf E$ and $\mathbf B$ on the potentials $\varphi$ and $\mathbf A$ is developed in Section
1.12 (compare Exercise 1.12.6).

ANS. $m\ddot x_i = q[\mathbf E + \mathbf v \times \mathbf B]_i$.

18.2.7 Consider a system in which the Lagrangian is given by

$$L(q_i, \dot q_i) = T(q_i, \dot q_i) - V(q_i),$$

where $q_i$ and $\dot q_i$ represent sets of variables. The potential energy $V$ is
independent of velocity and neither $T$ nor $V$ has any explicit time
dependence.

(a) Show that

$$\frac{d}{dt}\left(\sum_j \dot q_j\frac{\partial L}{\partial\dot q_j} - L\right) = 0.$$

(b) The constant quantity

$$\sum_j \dot q_j\frac{\partial L}{\partial\dot q_j} - L$$

defines the Hamiltonian $H$. Show that under the preceding assumed
conditions, $H = T + V$, the total energy.

Note. The kinetic energy $T$ is a quadratic function of the $\dot q_i$.


18.3 Several Independent Variables 


Sometimes, the integrand $f$ of Eq. (18.1) will contain one unknown function
$u$, which is a function of several independent variables, $u = u(x, y, z)$, for the
three-dimensional case, for example. Equation (18.1) becomes

$$J = \iiint f[u, u_x, u_y, u_z, x, y, z]\,dx\,dy\,dz, \tag{18.44}$$

$u_x = \partial u/\partial x$, and so on. The variational problem is to find the function $u(x, y, z)$
for which $J$ is stationary,


$$\delta J = \alpha\left.\frac{\partial J}{\partial\alpha}\right|_{\alpha=0} = 0. \tag{18.45}$$

Generalizing Section 18.1, we let

$$u(x, y, z, \alpha) = u(x, y, z, 0) + \alpha\eta(x, y, z), \tag{18.46}$$

where $u(x, y, z, \alpha = 0)$ represents the (unknown) function for which Eq.
(18.45) is satisfied, whereas again $\eta(x, y, z)$ is the arbitrary deviation that
describes the varied function $u(x, y, z, \alpha)$. This deviation, $\eta(x, y, z)$, is required
to be differentiable and to vanish at the end points. Then from Eq. (18.46),

$$u_x(x, y, z, \alpha) = u_x(x, y, z, 0) + \alpha\eta_x, \tag{18.47}$$

and similarly for $u_y$ and $u_z$.

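Differentiating the integral Eq. (18.44) with respect to the parameter $\alpha$ and
then setting $\alpha = 0$, we obtain

$$\left.\frac{\partial J}{\partial\alpha}\right|_{\alpha=0} = \iiint\left(\frac{\partial f}{\partial u}\eta + \frac{\partial f}{\partial u_x}\eta_x + \frac{\partial f}{\partial u_y}\eta_y + \frac{\partial f}{\partial u_z}\eta_z\right)dx\,dy\,dz = 0. \tag{18.48}$$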
Again, we integrate each of the terms $(\partial f/\partial u_i)\eta_i$ by parts. The integrated
part vanishes at the end points (because the deviation $\eta$ is required to go to
zero at the end points) and$^9$

$$\iiint\left(\frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z}\right)\eta(x, y, z)\,dx\,dy\,dz = 0. \tag{18.49}$$

Since the variation $\eta(x, y, z)$ is arbitrary, the term in large parentheses is set
equal to zero. This yields the Euler equation for (three) independent variables,

$$\frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z} = 0. \tag{18.50}$$


EXAMPLE 18.3.1


Laplace's Equation An example of this sort of variational problem is
provided by electrostatics. The energy of an electrostatic field is

$$\text{energy density} = \tfrac12\varepsilon\mathbf E^2, \tag{18.51}$$

where $\mathbf E$ is the usual electrostatic force field. In terms of the static potential $\varphi$,

$$\text{energy density} = \tfrac12\varepsilon(\nabla\varphi)^2. \tag{18.52}$$

Now let us impose the requirement that the electrostatic energy (associated
with the field) in a given volume be a minimum. (Boundary conditions on $\mathbf E$
and $\varphi$ must still be satisfied.) We have the volume integral$^{10}$



$$J = \iiint(\nabla\varphi)^2\,dx\,dy\,dz = \iiint(\varphi_x^2 + \varphi_y^2 + \varphi_z^2)\,dx\,dy\,dz. \tag{18.53}$$

With

$$f(\varphi, \varphi_x, \varphi_y, \varphi_z, x, y, z) = \varphi_x^2 + \varphi_y^2 + \varphi_z^2, \tag{18.54}$$

the function $\varphi$ replacing the $u$ of Eq. (18.50), Euler's equation [Eq. (18.50)]
yields

$$-2(\varphi_{xx} + \varphi_{yy} + \varphi_{zz}) = 0 \tag{18.55}$$

or

$$\nabla^2\varphi(x, y, z) = 0, \tag{18.56}$$

which is Laplace's equation of electrostatics.

Closer investigation shows that this stationary value is indeed a minimum.
Thus, the demand that the field energy be minimized leads to Laplace's PDE. ■
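The connection between minimum field energy and Laplace's equation can be seen on a small grid. The sketch below is an illustration under stated assumptions, not the text's derivation: it works in two dimensions, uses a uniform $11 \times 11$ grid, fixes the boundary values to those of the harmonic function $x^2 - y^2$, and replaces $\iint(\nabla\varphi)^2\,dx\,dy$ by a sum of squared nearest-neighbor differences. The function names are hypothetical. Setting the derivative of this discrete energy with respect to an interior value to zero gives the averaging rule used in `relax`, which is the five-point form of Laplace's equation.

```python
# Sketch: a discrete Dirichlet energy is minimized by the discrete Laplace solution.

def dirichlet_energy(phi):
    """Sum of squared nearest-neighbor differences (grid spacing cancels in 2D)."""
    n = len(phi)
    energy = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                energy += (phi[i + 1][j] - phi[i][j]) ** 2
            if j + 1 < n:
                energy += (phi[i][j + 1] - phi[i][j]) ** 2
    return energy

def relax(phi, sweeps):
    """Gauss-Seidel sweeps: each interior value becomes the mean of its neighbors."""
    n = len(phi)
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                phi[i][j] = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                    + phi[i][j + 1] + phi[i][j - 1])
    return phi

n = 11
h = 1.0 / (n - 1)
exact = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]

# Boundary values from the harmonic function, interior started at zero.
phi = [[exact[i][j] if i in (0, n - 1) or j in (0, n - 1) else 0.0
        for j in range(n)] for i in range(n)]
relax(phi, 500)

max_err = max(abs(phi[i][j] - exact[i][j]) for i in range(n) for j in range(n))
e_min = dirichlet_energy(phi)

# Any change of an interior value with the boundary held fixed raises the energy.
bumped = [row[:] for row in phi]
bumped[5][5] += 0.1
e_bumped = dirichlet_energy(bumped)
print(max_err, e_min, e_bumped)
```

The energy increase from the bump is purely second order in the deviation, the discrete counterpart of the $\alpha^2$ terms mentioned in Exercise 18.3.2.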


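9 Recall that $\partial/\partial x$ is a partial derivative, where $y$ and $z$ are held constant. However, $\partial/\partial x$ is also a
total derivative in that it acts on implicit $x$-dependence as well as on explicit $x$-dependence. In
this sense,

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right) = \frac{\partial^2 f}{\partial x\,\partial u_x} + \frac{\partial^2 f}{\partial u\,\partial u_x}u_x + \frac{\partial^2 f}{\partial u_x^2}u_{xx} + \frac{\partial^2 f}{\partial u_y\,\partial u_x}u_{xy} + \frac{\partial^2 f}{\partial u_z\,\partial u_x}u_{xz}.$$

10 The subscript $x$ indicates the $x$-partial derivative, not an $x$-component.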
EXERCISES

18.3.1 The Lagrangian for a vibrating string (small-amplitude vibrations) is

$$L = \int\left(\tfrac12\rho u_t^2 - \tfrac12\tau u_x^2\right)dx,$$

where $\rho$ is the (constant) linear mass density and $\tau$ is the (constant)
tension. The $x$-integration is over the length of the string. Show that
application of Hamilton's principle to the Lagrangian density (the
integrand), now with two independent variables, leads to the classical
wave equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{\rho}{\tau}\frac{\partial^2 u}{\partial t^2}.$$

18.3.2 Show that the stationary value of the total energy of the electrostatic
field of Example 18.3.1 is a minimum.

Hint. Use Eq. (18.48) and investigate the $\alpha^2$ terms.


18.4 Several Dependent and Independent Variables 


In some cases, our integrand $f$ contains more than one dependent variable
and more than one independent variable. Consider

$$f = f[p(x, y, z), p_x, p_y, p_z, q(x, y, z), q_x, q_y, q_z, r(x, y, z), r_x, r_y, r_z, x, y, z]. \tag{18.57}$$

We proceed as before with

$$p(x, y, z, \alpha) = p(x, y, z, 0) + \alpha\xi(x, y, z),$$
$$q(x, y, z, \alpha) = q(x, y, z, 0) + \alpha\eta(x, y, z), \tag{18.58}$$
$$r(x, y, z, \alpha) = r(x, y, z, 0) + \alpha\zeta(x, y, z), \quad\text{and so on.}$$


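Keeping in mind that $\xi$, $\eta$, and $\zeta$ are independent of one another, as were the $\eta_i$
in Section 18.3, the same differentiation and then integration by parts leads to

$$\frac{\partial f}{\partial p} - \frac{\partial}{\partial x}\frac{\partial f}{\partial p_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial p_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial p_z} = 0, \tag{18.59}$$

with similar equations for functions $q$ and $r$. Replacing $p, q, r, \ldots$ with $y_i$ and
$x, y, z, \ldots$ with $x_j$, we can put Eq. (18.59) in a more compact form:

$$\frac{\partial f}{\partial y_i} - \sum_j\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial y_{ij}}\right) = 0, \qquad i = 1, 2, \ldots, \tag{18.60}$$

in which

$$y_{ij} \equiv \frac{\partial y_i}{\partial x_j}.$$

An application of Eq. (18.59) appears in Section 18.5.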
Relation to Physics


The calculus of variations as developed so far provides an elegant description
of a wide variety of physical phenomena. The physics includes classical
mechanics in Section 18.2; relativistic mechanics, Exercise 18.2.5;
electrostatics, Example 18.3.1; and electromagnetic theory in Exercise 18.4.1. The
convenience should not be minimized, but at the same time we should be
aware that in these cases the calculus of variations has only provided an
alternate description of what was already known. Variational problems in quantum
mechanics are applications of the calculus of variations that are essential and
highly useful. The situation does change with incomplete theories.


• If the basic physics is not yet known, a postulated variational principle can 
be a useful starting point. 


EXERCISE 

18.4.1 The Lagrangian (per unit volume) of an electromagnetic field with a
charge density $\rho$ is given by

$$\mathcal L = \frac12\left(\varepsilon_0 E^2 - \frac{1}{\mu_0}B^2\right) - \rho\varphi + \rho\mathbf v\cdot\mathbf A.$$

Show that Lagrange's equations lead to two of Maxwell's equations.
(The remaining two are a consequence of the definition of $\mathbf E$ and $\mathbf B$ in
terms of $\mathbf A$ and $\varphi$.)

Hint. Take $A_1$, $A_2$, $A_3$, and $\varphi$ as dependent variables, $x$, $y$, $z$, and $t$ as
independent variables. Write $\mathbf E$ and $\mathbf B$ in terms of $\mathbf A$ and $\varphi$.


18.5 Lagrangian Multipliers: Variation with Constraints 


In this section, the concept of a constraint is introduced. To simplify the
treatment, the constraint appears first as a simple function rather than as an integral.
In the initial part of this section, we are not concerned with the calculus of
variations, but then, with our newly developed Lagrangian multipliers, constraints
are incorporated into the calculus of variations.

Consider a function of three independent variables, $f(x, y, z)$. For the
function $f$ to be an extreme or saddle point,

$$df = 0. \tag{18.61}$$

The necessary and sufficient condition for this is

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial z} = 0 \tag{18.62}$$

because the total variation

$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz, \tag{18.63}$$

where $dx$, $dy$, $dz$ vary independently.
Often in physical problems the variables $x$, $y$, $z$ are subjected to constraints
so that they are no longer all independent. It is possible, at least in principle, to
use each constraint to eliminate one variable and to proceed with a new and
smaller set of independent variables.

The use of Lagrangian multipliers is an alternate technique that may be
applied when this elimination of variables is inconvenient or undesirable, as
shown in Example 1.5.4. Let our equation of constraint be

$$\varphi(x, y, z) = 0, \tag{18.64}$$

from which $z(x, y)$ can be extracted as a function of $x$, $y$, if $x$, $y$ are taken as
the independent coordinates. Returning to Eq. (18.61), Eq. (18.62) no longer
follows because there are now only two independent variables. If we take $x$
and $y$ as these independent variables, $dz$ is no longer arbitrary. From the total
differential $d\varphi = 0$, we then obtain


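$$-\frac{\partial\varphi}{\partial z}dz = \frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy \tag{18.65}$$

and therefore

$$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \lambda\left(\frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy\right), \qquad \lambda = -\frac{f_z}{\varphi_z},$$

assuming that $\varphi_z \equiv \partial\varphi/\partial z \ne 0$. Here, we have written the partial derivatives
as $f_z$, $\varphi_z$. The condition $df = 0$ that $f$ be stationary can then be written as

$$\left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy = 0. \tag{18.66}$$

In other words, if our Lagrangian multiplier $\lambda$ is chosen so that

$$\frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z} = 0, \tag{18.67}$$

then $dx$ and $dy$ are arbitrary and the quantities in parentheses in Eq. (18.66)
must vanish,

$$\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} = 0. \tag{18.68}$$

Equations (18.67) and (18.68) are equivalent to

$$df + \lambda\,d\varphi = \left(\frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x}\right)dx + \left(\frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y}\right)dy + \left(\frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z}\right)dz = 0. \tag{18.69}$$

When Eqs. (18.67)-(18.69) are satisfied, $df = 0$ and $f$ is an extremum.
Notice that there are now four unknowns: $x$, $y$, $z$, and $\lambda$. The fourth equation,
of course, is the constraint equation (18.64). We want only $x$, $y$, and $z$ so that
$\lambda$ need not be determined. (However, often $\lambda$ has a real physical meaning that
depends on the problem.) For this reason, $\lambda$ is sometimes called Lagrange's
undetermined multiplier. This method will fail if all the coefficients of $\lambda$ vanish
at the extremum, $\partial\varphi/\partial x = \partial\varphi/\partial y = \partial\varphi/\partial z = 0$. It is then impossible to solve for $\lambda$.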
EXAMPLE 18.5.1


Particle in a Box As an example of the use of Lagrangian multipliers,
consider the quantum mechanical problem of a particle (mass $m$) in a box. The
box is a rectangular parallelepiped with sides $a$, $b$, and $c$. The ground state
energy of the particle is given by

$$E = \frac{h^2}{8m}\left(\frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2}\right). \tag{18.70}$$

We seek the shape of the box that will minimize the energy $E$, subject to
the constraint that the volume is constant,

$$V(a, b, c) = abc = k. \tag{18.71}$$

With $f(a, b, c) = E(a, b, c)$ and $\varphi(a, b, c) = abc - k = 0$, we obtain

$$\frac{\partial E}{\partial a} + \lambda\frac{\partial\varphi}{\partial a} = -\frac{h^2}{4ma^3} + \lambda bc = 0. \tag{18.72}$$

Also,

$$-\frac{h^2}{4mb^3} + \lambda ac = 0, \qquad -\frac{h^2}{4mc^3} + \lambda ab = 0.$$

Multiplying the first of these expressions by $a$, the second by $b$, and the
third by $c$, we have

$$\lambda abc = \frac{h^2}{4ma^2} = \frac{h^2}{4mb^2} = \frac{h^2}{4mc^2}. \tag{18.73}$$

Therefore, our solution is

$$a = b = c, \quad\text{a cube.} \tag{18.74}$$

Notice that $\lambda$ has not been determined but follows from Eq. (18.73). ■
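The multiplier conditions of this example are easy to test numerically. The following sketch is an illustration only: it assumes units in which $h^2/8m = 1$, picks the arbitrary volume $k = 8$, and uses hypothetical function names. It checks that the cube with $\lambda$ from Eq. (18.73) satisfies Eq. (18.72) and its two companions, and that randomly chosen boxes of the same volume never have a lower energy.

```python
import random

# Sketch in units with h^2/(8m) = 1, so E = 1/a^2 + 1/b^2 + 1/c^2 and dE/da = -2/a^3.

def energy(a, b, c):
    return 1.0 / a ** 2 + 1.0 / b ** 2 + 1.0 / c ** 2

def multiplier_residuals(a, b, c, lam):
    """Left-hand sides of dE/dx + lam * d(abc - k)/dx = 0 for x = a, b, c."""
    return (-2.0 / a ** 3 + lam * b * c,
            -2.0 / b ** 3 + lam * a * c,
            -2.0 / c ** 3 + lam * a * b)

k = 8.0                              # fixed volume (arbitrary choice)
side = k ** (1.0 / 3.0)              # cube edge
lam = 2.0 / (side ** 2 * k)          # Eq. (18.73) in these units: lam*abc = 2/a^2
residuals = multiplier_residuals(side, side, side, lam)
e_cube = energy(side, side, side)

rng = random.Random(0)               # fixed seed for reproducibility
samples = []
for _ in range(1000):
    a = rng.uniform(0.5, 4.0)
    b = rng.uniform(0.5, 4.0)
    c = k / (a * b)                  # enforce the constraint abc = k
    samples.append(energy(a, b, c))

print(residuals, e_cube, min(samples))
```

Eliminating $c$ through the constraint, as done in the sampling loop, is the alternative to the multiplier method mentioned at the start of this section.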



Variation with Constraints 


As in the preceding sections, we seek the path that will make the integral

$$J = \int f\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right)dx_j \tag{18.75}$$

stationary. Here and in the following, a sum over the index $j$ is understood.
This is the general case in which $x_j$ represents a set of independent variables
and $y_i$ a set of dependent variables. Again,

$$\delta J = 0. \tag{18.76}$$

Now, however, we introduce one or more constraints. This means that the $y_i$
are no longer independent of each other. Not all the $\eta_i$ may be varied arbitrarily
and Eqs. (18.50) or (18.60) would not apply. The constraint may have the form

$$\varphi_k(y_i, x_j) = 0. \tag{18.77}$$
In this case we may multiply by a function of $x_j$, for example, $\lambda_k(x_j)$, and
integrate over the same range as in Eq. (18.75) to obtain

$$\int\lambda_k(x_j)\varphi_k(y_i, x_j)\,dx_j = 0. \tag{18.78}$$

Then clearly

$$\delta\int\lambda_k(x_j)\varphi_k(y_i, x_j)\,dx_j = 0. \tag{18.79}$$

Alternatively, the constraint may appear in the form of an integral

$$\int\varphi_k(y_i, \partial y_i/\partial x_j, x_j)\,dx_j = \text{constant}, \tag{18.80}$$

generalizing Eq. (18.77). We may introduce any constant Lagrangian
multiplier, and again Eq. (18.79) follows, now with $\lambda$ a constant.

In either case, by adding Eqs. (18.76) and (18.79), possibly with more than
one constraint, we obtain

$$\delta\int\left[f\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) + \sum_k\lambda_k\varphi_k\right]dx_j = 0. \tag{18.81}$$

The Lagrangian multiplier $\lambda_k$ may depend on $x_j$ when $\varphi_k(y_i, x_j)$ is given in
the form of Eq. (18.77).

Treating the entire integrand as a new function $g(y_i, \partial y_i/\partial x_j, x_j)$,
we obtain

$$g\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) = f + \sum_k\lambda_k\varphi_k. \tag{18.82}$$

If we have $N$ $y_i$ ($i = 1, 2, \ldots, N$) and $m$ constraints ($k = 1, 2, \ldots, m$), $N - m$
of the $\eta_i$ may be taken as arbitrary. For the remaining $m$ $\eta_i$, the $\lambda_k$ may, in
principle, be chosen so that the remaining Euler-Lagrange equations are satisfied,
completely analogous to Eq. (18.67). The result is that our composite function
$g$ must satisfy the usual Euler-Lagrange equations

$$\frac{\partial g}{\partial y_i} - \sum_j\frac{\partial}{\partial x_j}\frac{\partial g}{\partial(\partial y_i/\partial x_j)} = 0, \tag{18.83}$$

with one such equation for each dependent variable $y_i$ [compare Eqs. (18.50)
and (18.60)]. These Euler equations and the equations of constraint are then
solved simultaneously to find the function yielding a stationary value.

Note that from the form of Eqs. (18.67) and (18.69), we could identify $f$ as
the function taking an extreme value subject to $\varphi$, the constraint, or identify
$f$ as the constraint and $\varphi$ as the function.
If we have a set of constraints $\varphi_k$, then Eqs. (18.67) and (18.69) become

$$\frac{\partial f}{\partial x_i} + \sum_k\lambda_k\frac{\partial\varphi_k}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n,$$

with a separate Lagrange multiplier $\lambda_k$ for each $\varphi_k$.


EXAMPLE 18.5.2


Maximum Surface Tension Given two points $-x_0$, $x_0 > 0$ on the $x$-axis,
find a curve $y(x)$ of fixed length $L > 0$ so that the area between the $x$-axis and
the curve is a maximum. The boundary condition is $y(-x_0) = 0 = y(x_0)$.

We start from the surface formula

$$\int_{-x_0}^{x_0} y(x)\,dx = \text{maximum}$$

under the constraint of constant length

$$L = \int_{-x_0}^{x_0}\sqrt{dx^2 + dy^2} = \int_{-x_0}^{x_0}\sqrt{1 + y_x^2}\,dx,$$

which leads to the integrand

$$f + \lambda\varphi = y + \lambda\sqrt{1 + y_x^2}$$

of the problem. Here, $\lambda$ is the constant Lagrange multiplier, $y$ is the dependent
variable, and $x$ is the independent variable. The Euler equation is

$$1 - \frac{d}{dx}\frac{\lambda y_x}{\sqrt{1 + y_x^2}} = 0.$$

Integrating it yields

$$\frac{\lambda y_x}{\sqrt{1 + y_x^2}} = x + a, \qquad a = \text{const.},$$

which can be solved for $y_x$:

$$y_x = \pm\frac{x + a}{\sqrt{\lambda^2 - (x + a)^2}}.$$

Integrating again yields

$$y(x) = \mp\sqrt{\lambda^2 - (x + a)^2} + b, \qquad b = \text{const.}$$

The boundary conditions lead to $(x_0 + a)^2 = (a - x_0)^2$; that is, $a = 0$ and
$b = \pm\sqrt{\lambda^2 - x_0^2}$. Let us choose $y > 0$. Note that $y' = y_x$ becomes infinite at
an interior point $x$ unless $\lambda^2 \ge x_0^2$. The curve $y(x)$ is a circular arc through
the points $(-x_0, 0)$, $(x_0, 0)$ of length $L = 2\lambda\arcsin(x_0/\lambda)$. When $\lambda = x_0$, then
$L = 2x_0\arcsin 1 = x_0\pi$, a half circle with maximal area. When $\lambda \to \infty$, or
$x_0 \ll \lambda$, then the arc becomes the closed circle $x^2 + (y - \lambda)^2 = \lambda^2$ with radius
$\lambda$ and length $2\lambda\pi$ and the $x$-axis its tangent; that is, $x_0 \to 0$ and the points
approach each other. ■
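To see the circular-arc solution at work, one can fix $x_0$ and $L$, solve $L = 2\lambda\arcsin(x_0/\lambda)$ for the multiplier, and compare the enclosed area with that of another curve of the same length. The sketch below makes several assumptions that are not in the text: it treats only arcs no larger than a half circle ($2x_0 < L \le \pi x_0$), solves for $\lambda$ by bisection, and uses an isosceles triangle as the comparison curve; all function names are hypothetical.

```python
import math

# Sketch: circular arc through (-x0, 0) and (x0, 0) with prescribed length.

def arc_length(lam, x0):
    """Length of the minor arc of radius lam over the chord [-x0, x0]."""
    return 2.0 * lam * math.asin(x0 / lam)

def solve_lambda(x0, length, tol=1e-12):
    """Bisection for the radius lam >= x0 with arc_length(lam, x0) = length."""
    if not (2.0 * x0 < length <= math.pi * x0):
        raise ValueError("this sketch handles only 2*x0 < L <= pi*x0")
    lo, hi = x0, 2.0 * x0
    while arc_length(hi, x0) > length:      # arc length decreases as lam grows
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if arc_length(mid, x0) > length:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def segment_area(lam, x0):
    """Area between the chord on the x-axis and the minor arc of radius lam."""
    root = math.sqrt(max(0.0, lam * lam - x0 * x0))
    return lam * lam * math.asin(x0 / lam) - x0 * root

def triangle_area(x0, length):
    """Isosceles triangle over the same chord with two legs of length L/2."""
    return x0 * math.sqrt((0.5 * length) ** 2 - x0 ** 2)

x0, length = 1.0, 3.0
lam = solve_lambda(x0, length)
area_arc = segment_area(lam, x0)
area_triangle = triangle_area(x0, length)
print(lam, area_arc, area_triangle)
```

For $\lambda = x_0$ the segment area reduces to $\pi x_0^2/2$, the half circle mentioned above.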
Lagrangian Equations


In the absence of constraints, Lagrange's equations of motion [Eq. (18.38)]
were found to be$^{11}$

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_i} - \frac{\partial L}{\partial q_i} = 0,$$

with $t$ (time) the one independent variable and $q_i(t)$ (particle position) a set of
dependent variables. Usually, the coordinates $q_i$ are chosen to eliminate the
forces of constraint, but this is not necessary and not always desirable. In the
presence of constraints $\varphi_k(q_i, t) = 0$ that are independent of the velocities $\dot q_i$,
Hamilton's principle is

$$\delta\int\left[L(q_i, \dot q_i, t) + \sum_k\lambda_k(t)\varphi_k(q_i, t)\right]dt = 0, \tag{18.84}$$

and the constrained Lagrangian equations of motion are

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_i} - \frac{\partial L}{\partial q_i} = \sum_k a_{ik}\lambda_k. \tag{18.85}$$

Here, the right-hand side is the total force of constraint corresponding to the
coordinate $q_i$. The coefficient $a_{ik}$ is given by

$$a_{ik} = \frac{\partial\varphi_k}{\partial q_i}. \tag{18.86}$$

If $q_i$ is a length, then $a_{ik}\lambda_k$ (no summation) represents the force of the $k$th
constraint in the $\hat q_i$ direction, appearing in Eq. (18.85) in exactly the same way
as $-\partial V/\partial q_i$.


EXAMPLE 18.5.3 


Simple Pendulum To illustrate, consider the simple pendulum, a mass $m$,
constrained by a wire of length $l$ to swing in an arc (Fig. 18.8). In the absence
of the constraint

$$\varphi_1 = r - l = 0 \tag{18.87}$$

there are two generalized coordinates $r$ and $\theta$ (motion in vertical plane). The
Lagrangian is (taking the potential $V$ to be zero when the pendulum is
horizontal, $\theta = \pi/2$)

$$L = T - V = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) + mgr\cos\theta. \tag{18.88}$$


By Eq. (18.85), the equations of motion are

$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} - \frac{\partial L}{\partial r} = \lambda_1, \qquad \frac{d}{dt}\frac{\partial L}{\partial\dot\theta} - \frac{\partial L}{\partial\theta} = 0 \qquad (a_{r1} = 1,\ a_{\theta1} = 0), \tag{18.89}$$

or

$$\frac{d}{dt}(m\dot r) - mr\dot\theta^2 - mg\cos\theta = \lambda_1, \qquad \frac{d}{dt}(mr^2\dot\theta) + mgr\sin\theta = 0. \tag{18.90}$$

Substituting in the equation of constraint ($r = l$, $\dot r = 0$), we have

$$ml\dot\theta^2 + mg\cos\theta = -\lambda_1, \qquad ml^2\ddot\theta + mgl\sin\theta = 0. \tag{18.91}$$

The second equation of Eq. (18.91) may be solved for $\theta(t)$ to yield simple
harmonic motion if the amplitude is small ($\sin\theta \approx \theta$), whereas the first equation
expresses the tension in the wire in terms of $\theta$ and $\dot\theta$.

Note that since the equation of constraint [Eq. (18.87)] is in the form of
Eq. (18.77), the Lagrange multiplier $\lambda$ may be (and here is) a function of $t$
[because $\theta = \theta(t)$, a function]. ■


11 The symbol $q$ is customary in classical mechanics. It serves to emphasize that the variable is
not necessarily a Cartesian variable (and not necessarily a length).

Figure 18.9 A Particle Sliding on a Cylindrical Surface


EXAMPLE 18.5.4


Sliding off a Log Closely related to this is the problem of a particle sliding
on a cylindrical surface. The object is to find the critical angle $\theta_c$ at which the
particle flies off from the surface. This critical angle is the angle at which the
radial force of constraint goes to zero (Fig. 18.9).

We have (with $\theta = 0$ in the upward vertical direction)

$$L = T - V = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) - mgr\cos\theta \tag{18.92}$$

and the equation of constraint

$$\varphi_1 = r - l = 0. \tag{18.93}$$

Proceeding as in Example 18.5.3 with $a_{r1} = 1$,

$$m\ddot r - mr\dot\theta^2 + mg\cos\theta = \lambda_1(\theta), \qquad mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = 0, \tag{18.94}$$

in which the constraining force $\lambda_1$ is a function of the angle $\theta$.$^{12}$ Since
$r = l$, $\ddot r = \dot r = 0$, the equations of motion reduce to

$$-ml\dot\theta^2 + mg\cos\theta = \lambda_1(\theta), \tag{18.95}$$

$$ml^2\ddot\theta - mgl\sin\theta = 0. \tag{18.96}$$

Differentiating Eq. (18.95) with respect to time using the chain rule

$$\frac{df(\theta)}{dt} = \frac{df(\theta)}{d\theta}\dot\theta,$$
we obtain

$$-2ml\dot\theta\ddot\theta - mg\sin\theta\,\dot\theta = \frac{d\lambda_1(\theta)}{d\theta}\dot\theta.$$

If $\dot\theta \ne 0$, we can drop this factor to find

$$-2ml\ddot\theta - mg\sin\theta = \frac{d\lambda_1(\theta)}{d\theta}.$$

Multiplying this result by $l/2$ and adding it to Eq. (18.96) eliminates the $\ddot\theta$ term
and yields

$$-3mg\sin\theta = \frac{d\lambda_1(\theta)}{d\theta},$$

which we integrate to get

$$\lambda_1(\theta) = 3mg\cos\theta + C, \qquad C = \text{const.}$$

Using the initial condition, when the particle is at rest, $\theta(0) = 0$, $\dot\theta = 0$ at $t = 0$,
in the radial equation of motion [Eq. (18.94)] yields

$$\lambda_1(0) = mg,$$

consistent with $\lambda_1$ being the radial force, and we find

$$C = -2mg.$$

12 Note that $\lambda_1$ is the radial force exerted by the cylinder on the particle. Consideration of the
physical problem should show that $\lambda_1$ must depend on the angle $\theta$. We permitted $\lambda = \lambda(t)$. Now
we are replacing the time dependence by an (unknown) angular dependence.
The particle $m$ will stay on the surface as long as the force of constraint is
nonnegative; that is, as long as the surface has to push outward on the particle,

$$\lambda_1(\theta) = 3mg\cos\theta - 2mg \ge 0. \tag{18.97}$$

The critical angle lies where $\lambda_1(\theta_c) = 0$, the force of constraint going to zero:

$$\cos\theta_c = \tfrac23, \quad\text{or}\quad \theta_c = 48^\circ 11' \tag{18.98}$$

from the vertical. At this angle (neglecting all friction) our particle takes off.

This result can be obtained more easily by considering a varying centripetal
force furnished by the radial component of the gravitational force. The example
was chosen to illustrate the use of Lagrange's undetermined multiplier in a
simple physical system. ■
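The critical angle can also be found by integrating the equation of motion directly and watching the constraint force. The sketch below is an illustration under stated assumptions: it integrates Eq. (18.96) with a fourth-order Runge-Kutta step, evaluates $\lambda_1$ from Eq. (18.95), and, because a particle exactly at rest on top never starts to move, begins at a small angle with the angular velocity that energy conservation assigns to a particle released from rest at $\theta = 0$. The function name and the parameter values ($g = 9.81$, $l = m = 1$) are hypothetical.

```python
import math

# Sketch: follow the particle on the cylinder until the constraint force vanishes.

def slide_off_angle(g=9.81, l=1.0, m=1.0, theta0=1e-3, dt=1e-3):
    def accel(theta):
        return (g / l) * math.sin(theta)                      # Eq. (18.96)

    def constraint_force(theta, omega):
        return m * g * math.cos(theta) - m * l * omega ** 2   # Eq. (18.95)

    theta = theta0
    # Angular velocity consistent with starting from rest at theta = 0.
    omega = math.sqrt(2.0 * g * (1.0 - math.cos(theta0)) / l)
    lam = constraint_force(theta, omega)
    while lam > 0.0:
        # One RK4 step for theta' = omega, omega' = accel(theta).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        new_theta = theta + dt / 6.0 * (k1t + 2 * k2t + 2 * k3t + k4t)
        new_omega = omega + dt / 6.0 * (k1w + 2 * k2w + 2 * k3w + k4w)
        new_lam = constraint_force(new_theta, new_omega)
        if new_lam <= 0.0:
            # Linear interpolation for the zero crossing of the constraint force.
            frac = lam / (lam - new_lam)
            return theta + frac * (new_theta - theta)
        theta, omega, lam = new_theta, new_omega, new_lam
    return theta

theta_c = slide_off_angle()
print(math.degrees(theta_c), math.degrees(math.acos(2.0 / 3.0)))
```

The two printed angles can be compared with the value quoted in Eq. (18.98); the result does not depend on $m$, $g$, or $l$, as Eq. (18.97) suggests.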


EXAMPLE 18.5.5 


The Schrödinger Wave Equation As a final illustration of a constrained
minimum, let us find the Euler equations for a quantum mechanical problem

$$\delta\iiint\psi^*(x, y, z)H\psi(x, y, z)\,dx\,dy\,dz = 0, \tag{18.99}$$

with the constraint

$$\iiint\psi^*\psi\,dx\,dy\,dz = 1. \tag{18.100}$$

Equation (18.99) is a statement that the energy of the system is stationary,
with $H$ being the quantum mechanical Hamiltonian for a particle of mass $m$, a
differential operator,

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(x, y, z). \tag{18.101}$$

Equation (18.100), the constraint, is the condition that there will be exactly
one particle present; $\psi$ is the usual wave function, a dependent variable, and
$\psi^*$, its complex conjugate, is treated as a second dependent variable.$^{13}$

The integrand in Eq. (18.99) involves second derivatives, which can be
converted to first derivatives by integrating by parts:

$$\int\psi^*\frac{\partial^2\psi}{\partial x^2}\,dx = \psi^*\frac{\partial\psi}{\partial x}\bigg| - \int\frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx. \tag{18.102}$$

We assume either periodic boundary conditions (as in the Sturm-Liouville
theory; Chapter 9) or that the volume of integration is so large that $\psi$ and
$\psi^*$ vanish strongly at the boundary.$^{14}$ Then the integrated part vanishes and


13 Compare Section 6.1.
14 $\lim_{r\to\infty} r\psi(r) = 0$.
r—>oo 
Eq. (18.99) may be rewritten as

$$\delta\iiint\left[\frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi\right]dx\,dy\,dz = 0. \tag{18.103}$$

The function $g$ of Eq. (18.82) is

$$g = \frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi - \lambda\psi^*\psi = \frac{\hbar^2}{2m}(\psi_x^*\psi_x + \psi_y^*\psi_y + \psi_z^*\psi_z) + V\psi^*\psi - \lambda\psi^*\psi, \tag{18.104}$$

again using the subscript $x$ to denote $\partial/\partial x$. For $y_i = \psi^*$, Eq. (18.83) becomes

$$\frac{\partial g}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial g}{\partial\psi_x^*} - \frac{\partial}{\partial y}\frac{\partial g}{\partial\psi_y^*} - \frac{\partial}{\partial z}\frac{\partial g}{\partial\psi_z^*} = 0.$$

This yields

$$V\psi - \lambda\psi - \frac{\hbar^2}{2m}(\psi_{xx} + \psi_{yy} + \psi_{zz}) = 0$$

or

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = \lambda\psi. \tag{18.105}$$

Reference to Eq. (18.101) enables us to identify $\lambda$ physically as the energy of
the quantum mechanical system. With this interpretation, Eq. (18.105) is the
celebrated Schrödinger wave equation. This variational approach is more than
just a matter of academic curiosity. It provides a very powerful method of
obtaining approximate solutions of the wave equation (Rayleigh-Ritz variational
method; Section 18.6). ■


EXERCISES 

The following problems are to be solved by using Lagrangian multipliers. 

18.5.1 The ground state energy of a particle in a pillbox (right-circular
cylinder) is given by

$$E = \frac{\hbar^2}{2m} \left( \frac{(2.4048)^2}{R^2} + \frac{\pi^2}{H^2} \right),$$

where R is the radius, H is the height of the pillbox, and 2.4048 is
the first zero of $J_0(r)$. Find the ratio of R to H that will minimize the
energy for a fixed volume.

18.5.2 Find the ratio of R (radius) to H (height) that will minimize the total
surface area of a right-circular cylinder of fixed volume of the previous
problem.

18.5.3 The U.S. Post Office limits first-class mail to Canada to a total of
36 in., length plus girth. Using a Lagrange multiplier, find the maximum
volume and the dimensions of a (rectangular parallelepiped) package
subject to this constraint.

18.5.4 For a lens of focal length f the object distance p and the image
distance q are related by $1/p + 1/q = 1/f$. Find the minimum
object-image distance $(p + q)$ for fixed f. Assume real object and image (p
and q both positive).

18.5.5 You have an ellipse $(x/a)^2 + (y/b)^2 = 1$. Find the inscribed rectangle
of maximum area. Show that the ratio of the maximum rectangle area
to the area of the ellipse is $2/\pi = 0.6366$.

18.5.6 A rectangular parallelepiped is inscribed in an ellipsoid of semiaxes
a, b, and c. Maximize the volume of the inscribed rectangular
parallelepiped. Show that the ratio of the maximum volume to the volume
of the ellipsoid is $2/(\pi\sqrt{3}) \approx 0.367$.


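18.5.7 A deformed sphere has a radius given by $r = r_0\{\alpha_0 + \alpha_2 P_2(\cos\theta)\}$,
where $\alpha_0 \approx 1$ and $|\alpha_2| \ll |\alpha_0|$. Show that the area and volume are

$$A = 4\pi r_0^2 \alpha_0^2 \left\{ 1 + \frac{4}{5}\left( \frac{\alpha_2}{\alpha_0} \right)^2 \right\}, \qquad
V = \frac{4\pi r_0^3}{3} \alpha_0^3 \left\{ 1 + \frac{3}{5}\left( \frac{\alpha_2}{\alpha_0} \right)^2 \right\}.$$

Terms of order $\alpha_2^3$ have been neglected.

(a) With the constraint that the enclosed volume be held constant,
$V = 4\pi r_0^3/3$, show that the bounding surface of minimum area is a
sphere ($\alpha_0 = 1$, $\alpha_2 = 0$).

(b) With the constraint that the area of the bounding surface be held
constant (i.e., $A = 4\pi r_0^2$), show that the enclosed volume is a
maximum when the surface is a sphere.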


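18.5.8 Find the maximum value of the directional derivative of $\varphi(x, y, z)$,

$$\frac{d\varphi}{ds} = \frac{\partial\varphi}{\partial x}\cos\alpha + \frac{\partial\varphi}{\partial y}\cos\beta + \frac{\partial\varphi}{\partial z}\cos\gamma,$$

subject to the constraint

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1.$$

ANS. $\left( \dfrac{d\varphi}{ds} \right) = |\nabla\varphi|$.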


Concerning the following exercises, note that in a quantum-mechanical
system there are $g_i$ distinct quantum states between energy $E_i$ and
$E_i + dE_i$. The problem is to describe how $n_i$ particles are
distributed among these states subject to two constraints:

(a) fixed number of particles,

$$\sum_i n_i = n;$$

(b) fixed total energy,

$$\sum_i n_i E_i = E.$$


18.5.9 For identical particles obeying the Pauli exclusion principle, the
probability of a given arrangement is

$$W_{FD} = \prod_i \frac{g_i!}{n_i!\,(g_i - n_i)!}.$$

Show that maximizing $W_{FD}$ subject to a fixed number of particles and
fixed total energy leads to

$$n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} + 1}.$$

With $\lambda_1 = -E_0/kT$ and $\lambda_2 = 1/kT$, this yields Fermi-Dirac statistics.
Hint. Try working with $\ln W$ and using Stirling's formula (Section
10.3). The justification for differentiation with respect to $n_i$ is that
we are dealing here with a large number of particles, $\Delta n_i/n_i \ll 1$.


18.5.10 For identical particles but no restriction on the number in a given
state, the probability of a given arrangement is

$$W_{BE} = \prod_i \frac{(n_i + g_i - 1)!}{n_i!\,(g_i - 1)!}.$$

Show that maximizing $W_{BE}$, subject to a fixed number of particles and
fixed total energy, leads to

$$n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} - 1}.$$

With $\lambda_1 = -E_0/kT$ and $\lambda_2 = 1/kT$, this yields Bose-Einstein
statistics.

Note. Assume that $g_i \gg 1$.


18.5.11 Photons satisfy Bose-Einstein statistics and the constraint that total
energy is constant. They clearly do not satisfy the fixed number
constraint. Show that eliminating the fixed number constraint leads to the
foregoing result but with $\lambda_1 = 0$.

18.5.12 A particle, mass m, is on a frictionless horizontal surface. It is
constrained to move so that $\theta = \omega t$ (rotating radial arm, no friction). With
the initial conditions

$$t = 0, \qquad r = r_0, \qquad \dot r = 0,$$

(a) find the radial positions as a function of time;

ANS. $r(t) = r_0 \cosh \omega t$.

(b) find the force exerted on the particle by the constraint.

ANS. $F^{(c)} = 2m\dot r\omega = 2m r_0 \omega^2 \sinh \omega t$.

18.5.13 A point mass m is moving over a flat, horizontal, frictionless plane. The
mass is constrained by a string to move radially inward at a constant
rate. Using plane polar coordinates $(\rho, \varphi)$, $\rho = \rho_0 - kt$,

(a) Set up the Lagrangian.

(b) Obtain the constrained Lagrange equations.

(c) Solve the $\varphi$-dependent Lagrange equation to obtain $\omega(t)$, the
angular velocity. What is the physical significance of the constant of
integration that you get from your "free" integration?

(d) Using the $\omega(t)$ from part (c), solve the $\rho$-dependent (constrained)
Lagrange equation to obtain $\lambda(t)$. In other words, explain what is
happening to the force of constraint as $\rho \to 0$.

18.5.14 A flexible cable is suspended from two fixed points. The length of the 
cable is fixed. Find the curve that will minimize the total gravitational 
potential energy of the cable. 

ANS. Hyperbolic cosine. 

18.5.15 A fixed volume of water is rotating in a cylinder with constant angular
velocity $\omega$. Find the water surface that will minimize the action.

ANS. Parabola. 


18.5.16 (a) Show that for a fixed-length string, the figure with maximum area
enclosed by it is a circle.

(b) Show that for a fixed area, the curve with minimum perimeter is
a circle.

Hint. The radius of curvature R is given by

$$R = \frac{(r^2 + r_\theta^2)^{3/2}}{r r_{\theta\theta} - 2 r_\theta^2 - r^2}.$$

Note. The problems of this section, variation subject to constraints,
are often called isoperimetric. The term arose from problems of
maximizing area subject to a fixed perimeter, as in Exercise 18.5.16(a).


18.5.17 Show that requiring J, given by

$$J = \int_a^b \left[ p(x) y_x^2 - q(x) y^2 \right] dx,$$

to have a stationary value subject to the normalizing condition

$$\int_a^b y^2 w(x)\, dx = 1$$

and boundary condition

$$p y_x y \Big|_a^b = 0$$

leads to the Sturm-Liouville equation of Chapter 9:

$$\frac{d}{dx}\left( p \frac{dy}{dx} \right) + q y + \lambda w y = 0.$$

Give a physical interpretation of J.

Note. The boundary condition is used in Section 9.1 in establishing the
Hermitian property of the operator.


18.6 Rayleigh-Ritz Variational Technique 


Exercise 18.5.17 opens up a connection between the calculus of variations
and eigenfunction-eigenvalue problems. We may rewrite the expression of
Exercise 18.5.17 as

$$F[y(x)] = \frac{\int_a^b \left( p y_x^2 - q y^2 \right) dx}{\int_a^b y^2 w\, dx}. \tag{18.106}$$

The normalization integral appears in the denominator instead of in a
constraint. The quantity F, a functional of the function y(x), is homogeneous
in y and independent of the normalization of y, and it corresponds to the
constrained stationary value of J. Then from Exercise 18.5.17, with
$f = p y_x^2 - q y^2 - \lambda w y^2$ and Euler's equation we find that y(x)
satisfies the Sturm-Liouville equation

$$\frac{d}{dx}\left( p \frac{dy}{dx} \right) + q y + \lambda w y = 0, \tag{18.107}$$

with $\lambda$ now the eigenvalue (whereas in Exercise 18.5.17 it was a Lagrangian
multiplier). Our optimum function y(x) is such that J and F take on a stationary
value. Integrating the first term in the numerator of Eq. (18.106) by parts and
using the boundary condition,

$$p y_x y \Big|_a^b = 0, \tag{18.108}$$

we obtain

$$F[y(x)] = -\int_a^b y \left\{ \frac{d}{dx}\left( p \frac{dy}{dx} \right) + q y \right\} dx \bigg/ \int_a^b y^2 w\, dx. \tag{18.109}$$

Then substituting in Eq. (18.107), the stationary values of F[y(x)] are given by

$$F[y_n(x)] = \lambda_n, \tag{18.110}$$

with $\lambda_n$ the eigenvalue corresponding to the eigenfunction $y_n$. Equation
(18.110) with F given by either Eq. (18.106) or (18.109) forms the basis of
the approximate Rayleigh-Ritz method for the computation of eigenfunctions
and eigenvalues.


Biographical Data 

Rayleigh, John William Strutt, Lord. Rayleigh, an English physicist, was
born in 1842 in Essex, England, and died in 1919 in Essex. He developed
the theory of the scattering of light by atmospheric dust, explaining the blue
color of the sky; contributed to black-body radiation at long wavelengths; and
analyzed sound and water waves. He was awarded the physics Nobel prize in 1904
for the discovery of argon.












Ground-State Eigenfunction 


Suppose that we seek to compute the ground-state eigenfunction $y_0$ and
eigenvalue¹⁵ $\lambda_0$ of some Hermitian operator with a lower bound that need not be
positive or zero of some complicated atomic or nuclear system. The classical
example for which no exact solution exists is the helium atom problem. The
eigenfunction $y_0$ is unknown, but we assume that we can make a pretty good
guess at an approximate function y so that mathematically we may write¹⁶

$$y = y_0 + \sum_{i=1}^{\infty} c_i y_i. \tag{18.111}$$

The $c_i$ are small quantities compared to unity. (How small depends on how good
our guess y was compared to $y_0$.) The $y_i$ are orthonormalized eigenfunctions
(also unknown); therefore, our trial function y is not normalized.

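Substituting the approximate function y into Eq. (18.109) and noting that

$$\int_a^b y_i \left\{ \frac{d}{dx}\left( p \frac{dy_j}{dx} \right) + q y_j \right\} dx = -\lambda_i \delta_{ij}, \tag{18.112}$$

we obtain

$$F[y(x)] = \frac{\lambda_0 + \sum_{i=1}^{\infty} c_i^2 \lambda_i}{1 + \sum_{i=1}^{\infty} c_i^2}. \tag{18.113}$$

Here, we have taken the eigenfunctions to be orthogonal since they are
solutions of the Sturm-Liouville equation [Eq. (18.107)]. We also assume that $y_0$ is
nondegenerate. Now we convert Eq. (18.113) to the form

$$F[y(x)] = \frac{\lambda_0 \left( 1 + \sum_{i=1}^{\infty} c_i^2 \right) + \sum_{i=1}^{\infty} c_i^2 (\lambda_i - \lambda_0)}{1 + \sum_{i=1}^{\infty} c_i^2} \geq \lambda_0, \tag{18.114}$$

where we neglect the second term in the numerator, which is nonnegative
because $\lambda_i > \lambda_0$, to get the lower bound $\lambda_0$. Equation (18.114) contains three
important results: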


• Whereas the error in the eigenfunction y was of order $O(c_i)$, the error in
$\lambda$ is only of order $O(c_i^2)$. Even a poor approximation of the eigenfunctions
may yield an accurate calculation of the eigenvalue.

• If $\lambda_0$ is the lowest eigenvalue (ground state), then since $\lambda_i - \lambda_0 > 0$, the
expectation value of the operator,

$$F[y(x)] = \lambda \geq \lambda_0, \tag{18.115}$$

or our approximation is always on the high side becoming lower, converging
on $\lambda_0$ as our approximate eigenfunction y improves ($c_i \to 0$). Note that
Eq. (18.115) is a direct consequence of Eq. (18.113). More directly, F[y(x)]
in Eq. (18.113) is the positively weighted average of the $\lambda_i$ and, therefore,
must be no smaller than the smallest $\lambda_i$, to wit, $\lambda_0$.

• We do not know $y_0$; in practice, we invent a y(x), adjusting its form to
get the smallest F[y(x)] possible. Often, parameters in y may be varied to
minimize F and thereby improve the estimate of the ground-state energy $\lambda_0$.
The method can be extended to excited states by including orthogonality
constraints to the ground state and lower lying excitations.

¹⁵ This means that $\lambda_0$ is the lowest eigenvalue. It is clear from Eq. (18.106) that if $p(x) \geq 0$ and
$q(x) \leq 0$ (compare Table 9.1), then F[y(x)] has a lower bound and this lower bound is nonnegative.
Recall from Section 9.1 that $w(x) \geq 0$.

¹⁶ We are guessing at the form of the function. The normalization is irrelevant.
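
The first result can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates Eq. (18.113) for an assumed three-level spectrum $\lambda_n = (n\pi)^2$ (the vibrating-string values, chosen only for illustration) with all admixture coefficients set to the same small number. The function name `rayleigh_estimate` is hypothetical.

```python
import math

def rayleigh_estimate(eigenvalues, coeffs):
    """Evaluate Eq. (18.113): (lam_0 + sum c_i^2 lam_i) / (1 + sum c_i^2)."""
    lam0, excited = eigenvalues[0], eigenvalues[1:]
    numerator = lam0 + sum(c * c * lam for c, lam in zip(coeffs, excited))
    denominator = 1.0 + sum(c * c for c in coeffs)
    return numerator / denominator

# Assumed spectrum for illustration: lam_n = (n*pi)^2, n = 1, 2, 3.
eigenvalues = [(n * math.pi) ** 2 for n in (1, 2, 3)]

errors = {}
for eps in (0.1, 0.01):
    estimate = rayleigh_estimate(eigenvalues, [eps, eps])
    errors[eps] = estimate - eigenvalues[0]
    print(f"c_i = {eps:5.2f}  F = {estimate:.6f}  F - lam_0 = {errors[eps]:.3e}")
```

Shrinking the coefficients by a factor of 10 should shrink the eigenvalue error by roughly a factor of 100, and the estimate stays above $\lambda_0$, as Eq. (18.114) requires.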


EXAMPLE 18.6.1 


Vibrating String A vibrating string clamped at x = 0 and 1 satisfies the
eigenvalue equation

$$\frac{d^2 y}{dx^2} + \lambda y = 0 \tag{18.116}$$

and the boundary condition $y(0) = y(1) = 0$. For this simple example, we
recognize immediately that $y_0(x) = \sin \pi x$ (unnormalized) and $\lambda_0 = \pi^2$. However,
let us try the Rayleigh-Ritz technique.

With one eye on the boundary conditions, we try

$$y(x) = x(1 - x). \tag{18.117}$$

Then with p = 1, q = 0, and w = 1, using $y'' = -2$, Eq. (18.109) yields

$$F[y(x)] = \frac{2 \int_0^1 x(1 - x)\, dx}{\int_0^1 x^2 (1 - x)^2\, dx} = \frac{1/3}{1/30} = 10. \tag{18.118}$$

This result, $\lambda = 10$, is a fairly good approximation (1.3% error)¹⁷ of $\lambda_0 = \pi^2 \approx
9.8696$. The reader may have noted that y(x) [Eq. (18.117)] is not normalized
to unity. The denominator in F[y(x)] compensates for the lack of unit
normalization. In the usual calculation the eigenfunction would be improved by
introducing more terms and adjustable parameters, such as

$$y = x(1 - x) + a_2 x^2 (1 - x)^2. \tag{18.119}$$

It is convenient to have the additional terms orthogonal, but it is not necessary.
The parameter $a_2$ is adjusted to minimize F[y(x)]. In this case, choosing
$a_2 = 1.1353$ drives F[y(x)] down to 9.8697, very close to the exact
eigenvalue. ■
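
The one-parameter minimization can be sketched in a few lines. This is an illustrative sketch rather than the authors' procedure: it evaluates Eq. (18.106) with p = 1, q = 0, w = 1 by exact polynomial integration and locates the minimum with a golden-section search. The helper names (`poly_mul`, `integral_0_1`, `rayleigh_quotient`, `golden_section_min`) and the search interval [0, 3] are assumptions made for this example.

```python
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out

def integral_0_1(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

def rayleigh_quotient(a2):
    """F[y] of Eq. (18.106) with p = 1, q = 0, w = 1 for the trial function."""
    # y = x(1 - x) + a2 x^2 (1 - x)^2 = x + (a2 - 1) x^2 - 2 a2 x^3 + a2 x^4
    y = [0.0, 1.0, a2 - 1.0, -2.0 * a2, a2]
    dy = [k * c for k, c in enumerate(y)][1:]
    return integral_0_1(poly_mul(dy, dy)) / integral_0_1(poly_mul(y, y))

def golden_section_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

a_min = golden_section_min(rayleigh_quotient, 0.0, 3.0)
f_min = rayleigh_quotient(a_min)
print(f"a2 = {a_min:.4f}, F = {f_min:.5f}, pi^2 = {math.pi ** 2:.5f}")
```

Under these assumptions the search settles near $a_2 \approx 1.133$. The quotient is very flat around its minimum, so the value $a_2 = 1.1353$ quoted in the example gives the same F = 9.8697 to four decimal places.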


EXAMPLE 18.6.2 


S-Wave Particle in an Infinite Spherical Square Well We start from the
radial Schrödinger equation,

$$\frac{d^2 u}{dr^2} + k^2 u = 0, \qquad r < a,$$

with orbital angular momentum l = 0, potential V = 0, r < a, and $V \to +\infty$
for r > a. We call the energy eigenvalue $2\mu E/\hbar^2 = k^2$, where $\mu$ is the reduced
mass of the particle, and $u(r) = r\psi(r)$ is r times the radial wave function.

¹⁷ The closeness of the fit may be checked by a Fourier sine expansion (compare Exercise 14.2.3
over the half interval [0, 1] or, equivalently, over the interval [−1, 1], with y(x) taken to be odd).
Because of the even symmetry relative to x = 1/2, only odd n terms appear:

$$y(x) = x(1 - x) = \frac{8}{\pi^3} \left[ \sin \pi x + \frac{\sin 3\pi x}{3^3} + \frac{\sin 5\pi x}{5^3} + \cdots \right].$$

We take $u(r) = r(1 - r^2/a^2)$ as our unnormalized trial wave function,
imposing a node at r = a, where the potential becomes infinite. Then we
compare the ODE for u with the Sturm-Liouville ODE [Eq. (18.107)], which
yields p(x) = 1, q(x) = 0, w(x) = 1, and

$$F[u(r)] = \frac{-\int_0^a u u''\, dr}{\int_0^a u^2\, dr}.$$

Integrating the numerator by parts, or Eq. (18.106), gives

$$F[u(r)] = \frac{\int_0^a (u')^2\, dr}{\int_0^a u^2\, dr} = \frac{4a/5}{8a^3/105} = \frac{21}{2a^2},$$

using

$$u' = 1 - \frac{3r^2}{a^2}, \qquad
\int_0^a u^2\, dr = \int_0^a r^2 \left( 1 - \frac{r^2}{a^2} \right)^2 dr = \frac{a^3}{3} - \frac{2a^5}{5a^2} + \frac{a^7}{7a^4} = \frac{8a^3}{105},$$

$$\int_0^a (u')^2\, dr = \int_0^a \left( 1 - \frac{3r^2}{a^2} \right)^2 dr = \frac{4}{5} a.$$

Now we use the ODE $u'' = -k^2 u$ in the numerator of F to get the eigenvalue
$k^2$:

$$F[u(r)] = \frac{-\int_0^a u u''\, dr}{\int_0^a u^2\, dr} = k^2 = \frac{21}{2a^2}.$$

Comparing to the exact solution $k^2 = \pi^2/a^2$, our estimate is 6% off.
Exercise 18.6.5 addresses possible improvements. ■
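
The two integrals and the 6% figure can be confirmed with exact rational arithmetic. This is a small sketch, not from the text: it sets a = 1 (an assumption that loses no generality, since F scales as $1/a^2$), and the helper names are illustrative.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out

def integrate_poly(coeffs, upper):
    """Exact integral of sum(coeffs[k] * r**k) from 0 to upper."""
    return sum(c * upper ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))

a = Fraction(1)                                   # well radius (assumed 1)
u = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1) / a ** 2]  # r - r^3/a^2
du = [k * c for k, c in enumerate(u)][1:]         # u' = 1 - 3 r^2 / a^2

numerator = integrate_poly(poly_mul(du, du), a)   # integral of (u')^2
denominator = integrate_poly(poly_mul(u, u), a)   # integral of u^2
F = numerator / denominator
relative_error = float(F * a ** 2) / math.pi ** 2 - 1.0

print(numerator, denominator, F, f"{100 * relative_error:.2f}%")
```

With a = 1 the integrals come out as 4/5 and 8/105, giving F = 21/2, about 6.4% above $\pi^2$.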


EXERCISES 


18.6.1 The wave equation for the quantum-mechanical oscillator may be
written as

$$\frac{d^2 \psi(x)}{dx^2} + (\lambda - x^2)\psi(x) = 0,$$

with $\lambda = 1$ for the ground state [Eq. (13.4)]. Take

$$\psi_{\text{trial}} = \begin{cases} 1 - (x^2/a^2), & x^2 \leq a^2 \\ 0, & x^2 > a^2 \end{cases}$$

for the ground-state wave function (with $a^2$ an adjustable parameter)
and calculate the corresponding ground-state energy. How much error
do you have?

Note. Your parabola is really not a very good approximation to a
Gaussian solution. What improvements can you suggest?

18.6.2 The Schrödinger equation for a central potential may be written as

$$\mathcal{L}u(r) + \frac{\hbar^2 l(l + 1)}{2Mr^2} u(r) = E u(r).$$

The $l(l + 1)$ term is the angular momentum barrier; it is derived from
splitting off the angular dependence (Example 9.1.1). Treating this term
as a perturbation, use the variational technique to show that $E > E_0$,
where $E_0$ is the energy eigenvalue of $\mathcal{L}u_0 = E_0 u_0$ corresponding to
l = 0. This means that the minimum energy state will have l = 0, zero
angular momentum.

Hint. You can expand u(r) as $u_0(r) + \sum_{i=1}^{\infty} c_i u_i$, where $\mathcal{L}u_i = E_i u_i$,
$E_i > E_0$.

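18.6.3 In the matrix eigenvector, eigenvalue equation

$$\mathsf{A} \mathbf{r}_i = \lambda_i \mathbf{r}_i,$$

$\mathsf{A}$ is an n × n Hermitian matrix. For simplicity, assume that its n
real eigenvalues (Section 3.5) are distinct, $\lambda_1$ being the largest. If $\mathbf{r}$ is
an approximation to $\mathbf{r}_1$,

$$\mathbf{r} = \mathbf{r}_1 + \sum_{i=2}^{n} \delta_i \mathbf{r}_i,$$

show that

$$\frac{\mathbf{r}^\dagger \mathsf{A} \mathbf{r}}{\mathbf{r}^\dagger \mathbf{r}} \leq \lambda_1$$

and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.

Hint. The n $\mathbf{r}_i$ form a complete orthogonal set spanning the
n-dimensional (complex) space.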

18.6.4 The variational solution of Example 18.6.1 may be refined by taking
$y = x(1 - x) + a_2 x^2 (1 - x)^2$. Using a numerical quadrature, calculate
$\lambda_{\text{approx}} = F[y(x)]$ [Eq. (18.106)] for a fixed value of $a_2$. Vary $a_2$ to
minimize $\lambda$. Calculate the value of $a_2$ that minimizes $\lambda$ and $\lambda$ itself to
five significant figures. Compare your eigenvalue $\lambda$ with $\pi^2$.

18.6.5 (a) Improve the eigenvalue estimate of Example 18.6.2 using the trial
wave function $u = r(1 - r/a)(1 - r/b)$, b a parameter. Then
determine b by minimizing F[u]. The result is expected to satisfy
$\pi^2/a^2 < k^2 < 10.5/a^2$. Explain why this is not the case. Plot F[u]
versus b and determine its minimum value by eye. Which alternative
trial wave function does your result suggest?

(b) Use the trial wave function $u = r(1 - r/a)$ and show $F[u] = k^2 =
10/a^2$, improving Example 18.6.2 from 6 to 1.3%. Also use the trial
wave function $u = r[1 - (r/a)^b]$, b a parameter, to improve the
estimate further.


Additional Reading


Bliss, G. A. (1925). Calculus of Variations. Mathematical Association of
America, Open Court, LaSalle, IL. As one of the older texts, this is still a valuable
reference for details of problems such as minimum area problems.

Courant, R., and Robbins, H. (1996). What Is Mathematics?, 2nd ed. Oxford 
Univ. Press, New York. Chapter 7 contains a fine discussion of the calculus 
of variations, including soap film solutions to minimum area problems. 

Lanczos, C. (1970). The Variational Principles of Mechanics, 4th ed. Univ. of 
Toronto Press, Toronto. Reprinted, Dover, New York (1986). This book is 
a very complete treatment of variational principles and their applications 
to the development of classical mechanics. 

Sagan, H. (1961). Boundary and Eigenvalue Problems in Mathematical
Physics. Wiley, New York. Reprinted, Dover, New York (1989). This
delightful text could also be listed as a reference for Sturm-Liouville theory,
Legendre and Bessel functions, and Fourier series. Chapter 1 is an
introduction to the calculus of variations with applications to mechanics.
Chapter 7 picks up the calculus of variations again and applies it to
eigenvalue problems.

Sagan, H. (1969). Introduction to the Calculus of Variations. McGraw-Hill,
New York. Reprinted, Dover, New York (1983). This is an excellent
introduction to the modern theory of the calculus of variations, which is more
sophisticated and complete than his 1961 text. Sagan covers sufficiency
conditions and relates the calculus of variations to problems of space
technology.

Weinstock, R. (1952). Calculus of Variations. McGraw-Hill, New York. 
Reprinted, Dover, New York (1974). A detailed, systematic development of 
the calculus of variations and applications to Sturm-Liouville theory and 
physical problems in elasticity, electrostatics, and quantum mechanics. 

Yourgrau, W., and Mandelstam, S. (1968). Variational Principles in Dynamics
and Quantum Theory, 3rd ed. Saunders, Philadelphia. Reprinted, Dover,
New York (1979). This is a comprehensive, authoritative treatment of
variational principles. The discussions of the historical development and the
many metaphysical pitfalls are of particular interest.





Chapter 19 



Nonlinear Methods 
and Chaos 


Our mind would lose itself in the complexity of the world if that complexity 
were not harmonious; like the short-sighted, it would only see the details, and 
would be obliged to forget each of these details before examining the next, 
because it would be incapable of taking in the whole. The only facts worthy 
of our attention are those which introduce order into this complexity and so 
make it accessible to us. 

—Henri Poincare 


19.1 Introduction 


The origin of nonlinear dynamics goes back to the work of the renowned
French mathematician Henri Poincaré on celestial mechanics at the turn of
the twentieth century. Classical mechanics is, in general, nonlinear in its
dependence on the coordinates of the particles and the velocities. Examples are
vibrations with a nonlinear restoring force. The Navier-Stokes equations (see
Yourgrau and Mandelstam in Additional Reading of Chapter 18) are nonlinear, which
makes hydrodynamics difficult to handle. For almost four centuries, however,
following the lead of Galilei, Newton, and others, physicists have focused on
predictable, effectively linear responses of classical systems that usually have
linear and nonlinear properties.

Poincaré was the first to understand the possibility of completely irregular
or "chaotic" behavior of solutions of nonlinear differential equations (more
precise definitions of chaos will be developed later) that are characterized
by an extreme sensitivity to initial conditions: Given slightly different initial
conditions, from small perturbations or errors in measurements, for example,
solutions can grow exponentially apart with time so that the system soon
becomes effectively unpredictable or chaotic. This property of chaos is often
called the "butterfly" effect and will be discussed in Section 19.3. Since the
rediscovery of this effect by Lorenz in meteorology in the early 1960s, the field
of nonlinear dynamics has grown tremendously. Thus, nonlinear dynamics and
chaos theory have entered the mainstream of physics.

Numerous examples of nonlinear systems have been found to display
irregular behavior. Surprisingly, order in the sense of quantitative similarities
as universal properties or other regularities may arise spontaneously in chaos;
a first example, Feigenbaum's universal numbers $\alpha$ and $\delta$, will be discussed
in Section 19.2. Dynamical chaos is not a rare phenomenon but ubiquitous in
nature. It includes irregular shapes of clouds, coastlines, and other landscapes,
which are examples of fractals (discussed in Section 19.3), and turbulent flow
of fluids, water dripping from a faucet, and the weather. The damped, driven
pendulum is among the simplest systems displaying chaotic motion. As a rule,
we take the time as the independent dynamic variable.

Necessary conditions for chaotic (defined for now as completely irregular 
so as to be unpredictable) motion in dynamical systems described by first- 
order differential equations are known to be 

• at least three dynamical variables; and 

• one or more nonlinear terms coupling two or several of them. 

That is, external noise, perturbations, or complexity are not needed to generate 
random behavior. 

As in classical mechanics, the space of the time-dependent dynamical
variables of a system of coupled differential equations is called its phase space.
In such deterministic systems, trajectories in phase space are not allowed to
cross. If they did, the system would have a choice at each intersection and
would not be deterministic. If the phase space has only two dimensions, such
nonlinear systems allow only for fixed points, defined as equilibrium points
where iterations (or the motion) stop. An example is a damped pendulum
[which can be described by a set of two first-order ordinary differential
equations (ODEs) involving two first-order derivatives $\omega = \dot\theta$, $\dot\omega = f(\omega, \theta)$; that is,
only two dynamic variables $\omega(t)$ and $\theta(t)$]. In the undamped case, there are
only periodic motion and equilibrium points (e.g., the turning points if there is
a maximal angle). These systems are predictable and deterministic. If there is
a time-dependent driving term (e.g., $\sin \omega t$), we define a third variable $\varphi = \omega t$
so that the coupled ODEs do not depend on time explicitly. This adds one more
dimension to the phase space because $\dot\varphi = \omega = \text{const}$. With three (or more)
dynamic variables (e.g., damped, driven pendulum written as first-order
coupled ODEs again), more complicated nonintersecting trajectories are possible.
These can be shown to include chaotic motion and are called deterministic
chaos.

A central theme in chaos is the evolution of complex forms from the
repetition of simple but nonlinear operations; this is recognized as a fundamental
organizing principle of nature. Although nonlinear differential equations
are a natural place in physics for chaos to occur, the mathematically simpler
iteration of nonlinear functions provides a quicker entry to chaos theory, which
we will pursue first in Section 19.2.


Biographical Data 

Poincaré, Jules Henri. Poincaré, a French mathematician, was born in
1854 in Nancy, France, and died in 1912 in Paris. Like Hilbert, he made major
contributions to most branches of mathematics. In celestial mechanics he
worked on the three-body problem, tides and the tidal origin of the moon.
Independent of Lorentz, he developed Lorentz transformations from
electrodynamics.


19.2 The Logistic Map 


The nonlinear, one-dimensional iteration or difference equation

$$x_{n+1} = \mu x_n (1 - x_n), \qquad x_n \in [0, 1]; \quad 1 < \mu < 4, \tag{19.1}$$

is sometimes called the logistic map. It is patterned after the nonlinear
differential equation $dx/dt = \mu x(1 - x)$, used by P. F. Verhulst in 1845 to
model the development of a breeding population whose generations do not
overlap. The density of the population at time n is $x_n$. The positive linear term
simulates the birth rate and the nonlinear negative term the death rate of the
species in a constant environment that is controlled by the parameter $\mu$.

The quadratic function $f_\mu(x) = \mu x(1 - x)$ is chosen because it has one
maximum in the interval [0, 1] and is zero at the end points, $f_\mu(0) = 0 = f_\mu(1)$.
The maximum at $x_m = 1/2$ is determined from $f'(x) = 0$; that is,

$$f'_\mu(x_m) = \mu(1 - 2x_m) = 0, \qquad x_m = \tfrac{1}{2}, \tag{19.2}$$

where $f_\mu(1/2) = \mu/4$.

• Varying the single parameter $\mu$ controls a rich and complex behavior
including one-dimensional chaos, as we shall see. More parameters or additional
variables are hardly necessary at this point to increase the complexity. In a
qualitative sense, the simple logistic map of Eq. (19.1) is representative of
many dynamical systems in biology, chemistry, and physics.

Starting with some $x_0 \in [0, 1]$ in Eq. (19.1), we get $x_1$, then $x_2, \ldots$, defined
as a cycle. Plotting $f_\mu(x) = \mu x(1 - x)$ along with the diagonal, straight vertical
lines to show the intersections with the curve $f_\mu$, and horizontal lines to convert
$f_\mu(x_i) = x_{i+1}$ to the next x-coordinate, we can construct Fig. 19.1. Choosing
$\mu$ (= 2 in Fig. 19.1) and $x_0$ (= 0.1), the vertical line $x = x_0$ meets the curve
$f_\mu(x)$ in $x_1$ (= 0.18), and a horizontal line from $(x_0, x_1)$ intersects the diagonal
in $(x_1, x_1)$. The vertical line through $(x_1, x_1)$ meets the curve in $x_2$ (= 0.2952 in
Fig. 19.1), etc.

Figure 19.1 Cycle $(x_0, x_1, \ldots)$ for the Logistic Map for $\mu = 2$, Starting Value $x_0 = 0.1$ and Attractor $x^* = 1/2$
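
The cycle of Fig. 19.1 is easy to reproduce by direct iteration of Eq. (19.1). The following is a minimal sketch; the function names `logistic` and `iterate` are chosen here for illustration and do not come from the text.

```python
def logistic(mu, x):
    """One step of the logistic map, Eq. (19.1)."""
    return mu * x * (1.0 - x)

def iterate(mu, x0, n):
    """Return the cycle [x_0, x_1, ..., x_n]."""
    xs = [x0]
    for _ in range(n):
        xs.append(logistic(mu, xs[-1]))
    return xs

cycle = iterate(2.0, 0.1, 10)
for i, x in enumerate(cycle):
    print(f"x_{i} = {x:.6f}")
```

For $\mu = 2$ and $x_0 = 0.1$ the first iterates are 0.18 and 0.2952, as quoted for Fig. 19.1, and the sequence settles on 1/2 within a few steps.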



The $x_i$ converge toward (0.5, 0.5), a fixed point. A fixed point is defined
by $f_\mu(x) = x$. Thus, at the fixed point labeled $x^*$, the iteration stops so that

$$f_\mu(x^*) = \mu x^*(1 - x^*) = x^*, \quad \text{i.e.,} \quad x^* = 1 - 1/\mu. \tag{19.3}$$

For any initial $x_0$ satisfying $f_\mu(x_0) < x^*$, or $0 < x_0 < 1/\mu$, the $x_i$ converge
to $x^*$; such a fixed point is defined as stable. Such a fixed point is also called an
attractor or sink. The interval $(0, 1/\mu)$ defines a basin of attraction for the
fixed point $x^*$. The attractor $x^*$ is stable provided the slope $|f'_\mu(x^*)| < 1$, or
$1 < \mu < 3$. This can be seen from a Taylor expansion of an iteration near the
attractor

$$x_{n+1} = f_\mu(x_n) = f_\mu(x^*) + f'_\mu(x^*)(x_n - x^*) + \cdots,$$

from which follows

$$\frac{x_{n+1} - x^*}{x_n - x^*} = f'_\mu(x^*)$$

upon dropping all higher order terms. Thus, if $|f'_\mu(x^*)| < 1$, the next iterate
$x_{n+1}$ lies closer to $x^*$ than $x_n$, implying convergence to and stability of the fixed
point. However, if $|f'_\mu(x^*)| > 1$, $x_{n+1}$ moves farther from $x^*$ than $x_n$, implying
instability. Given the continuity of $f'_\mu$ in $\mu$, the fixed point and its properties
persist when the parameter (here $\mu$) is slightly varied.

For $\mu > 1$ and $x_0 < 0$ or $x_0 > 1$, it is easy to verify graphically or analytically
that the $x_i \to -\infty$. The origin x = 0 is a second fixed point. Since $f'_\mu(0) =
\mu(1 - 2x)|_{x=0} = \mu > 1$, the iterates move away from it. Such a fixed point
is called a repellor.

[Figure 19.2: bifurcation plot for the logistic map]

When

$$f'_\mu(x^*) = \mu(1 - 2x^*) = 2 - \mu = -1$$

is reached for $\mu = 3$, two fixed points occur, shown as the two branches in
Fig. 19.2 as $\mu$ increases beyond the value 3. They can be located by solving

$$x_2^* = f_\mu(f_\mu(x_2^*)) = \mu^2 x_2^* (1 - x_2^*)\left[ 1 - \mu x_2^* (1 - x_2^*) \right]$$

for $x_2^*$. Here, it is convenient to abbreviate $f^{(1)}(x) = f_\mu(x)$, $f^{(2)}(x) = f_\mu(f_\mu(x))$
for the second iterate, etc. Now we drop the common $x_2^*$ and then reduce
the remaining third-order polynomial to second order by recalling that a fixed
point of $f_\mu$ is also a fixed point of $f^{(2)}$ as $f_\mu(f_\mu(x^*)) = f_\mu(x^*) = x^*$. Therefore,
$x_2^* = x^*$ is one solution. Solving for the quadratic polynomial, we obtain

$$0 = \mu^2 \left[ 1 - (\mu + 1) x_2^* + 2\mu (x_2^*)^2 - \mu (x_2^*)^3 \right] - 1
   = (\mu - 1 - \mu x_2^*)\left[ \mu + 1 - \mu(\mu + 1) x_2^* + \mu^2 (x_2^*)^2 \right].$$

The roots of the quadratic polynomial are

$$x_2^* = \frac{1}{2\mu}\left( \mu + 1 \pm \sqrt{(\mu + 1)(\mu - 3)} \right),$$

which are the two branches in Fig. 19.2 for $\mu > 3$ starting at $x_2^* = 2/3$ at
$\mu = 3$. Each $x_2^*$ is a point of period 2 and invariant under two iterations of
the map $f_\mu$. The iterates oscillate between both branches of fixed points $x_2^*$.
A point $x_0$ is defined as a periodic point of period n for $f_\mu$ if $f^{(n)}(x_0) = x_0$
but $f^{(i)}(x_0) \neq x_0$ for $0 < i < n$. Thus, for $3 < \mu < 3.45$ (Fig. 19.2) the stable
attractor bifurcates into two fixed points $x_2^*$. The bifurcation for $\mu = 3$,
where the doubling occurs, is called a pitchfork bifurcation because of its
characteristic (rounded Y) shape. A bifurcation is a sudden change in the
evolution of the system, such as a splitting into two regions.
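
The period-2 points can be checked numerically. The sketch below (an illustration with assumed function names) solves the quadratic factor $\mu^2 x^2 - \mu(\mu + 1)x + (\mu + 1) = 0$ from the factorization above, confirms that the map swaps its two roots, and evaluates the slope of the second iterate by the chain rule.

```python
import math

def logistic(mu, x):
    """One step of the logistic map, Eq. (19.1)."""
    return mu * x * (1.0 - x)

def period_two_points(mu):
    """Roots of mu^2 x^2 - mu (mu + 1) x + (mu + 1) = 0, real for mu >= 3."""
    root = math.sqrt((mu + 1.0) * (mu - 3.0))
    return (mu + 1.0 - root) / (2.0 * mu), (mu + 1.0 + root) / (2.0 * mu)

def second_iterate_slope(mu, x):
    """Chain rule: d f^(2)/dx = f'(f(x)) * f'(x), with f'(x) = mu (1 - 2x)."""
    return mu * (1.0 - 2.0 * logistic(mu, x)) * mu * (1.0 - 2.0 * x)

for mu in (3.2, 1.0 + math.sqrt(6.0)):
    lower, upper = period_two_points(mu)
    slope = second_iterate_slope(mu, lower)
    print(f"mu = {mu:.5f}: x2* = {lower:.6f}, {upper:.6f}, slope = {slope:.6f}")
```

At $\mu = 3.2$ the slope has magnitude below 1, so the 2-cycle is stable. At $\mu = 1 + \sqrt{6}$ the slope reaches −1, which marks the next period doubling.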

As ii increases beyond 3, the derivative d/ (2) /dx decreases from unity to 
— 1. For m = 1 + V6 ~ 3.44949, which can be derived from 
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each branch of fixed points bifurcates again so that x£ — f (4> (x^) (i.e., it has 
period 4). For // = 1 + V6, these are a; 4 * = 0.43996 and x£ = 0.849938. With 
increasing period doublings it becomes impossible to obtain analytic solu¬ 
tions. The iterations are better done by computer, whose rapid improvements 
(computer-driven graphics, in particular) and wide distribution starting in the 
1970s and 1980s have accelerated the development of chaos theory. As the 
/x values where bifurcations occur become increasingly more closely spaced, 
the sequence of bifurcations continues with ever longer periods until it con¬ 
verges to /Too = 3.5699456..., where an infinite number of bifurcations occur. 
Near bifurcation points, fluctuations, rounding errors in initial conditions, etc. 
play an increasing role because the system has to choose between two pos¬ 
sible branches and becomes much more sensitive to small perturbations, 
a characteristic feature on the road to chaos. In the present case, for most 
lx > ix oo the x n never repeat, except for narrow periodic windows (that are 
unshaded in Fig. 19.2). The bands of fixed points x* begin forming a continuum 
(shown dark in Fig. 19.2): This is where chaos starts and what is defined as 
chaos. This increasing period doubling is the route to chaos for the logistic 
map that is characterized by a constant 8, called a Feigenbaum number. The 
first bifurcation occurs at fx\ = 3, the second at ix 2 = 3.44949, ..., and the 
ratios of spacings between the fi n can be shown to converge to 8: 

\[ \lim_{n\to\infty} \frac{\mu_n - \mu_{n-1}}{\mu_{n+1} - \mu_n} = \delta = 4.66920161\ldots \tag{19.4} \]

From the bifurcation plot Fig. 19.2, we obtain

\[ (\mu_2 - \mu_1)/(\mu_3 - \mu_2) = (3.45 - 3.00)/(3.54 - 3.45) = 4.97 \]

as a first approximation for the dimensionless $\delta$.
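The period doublings described above can be checked by direct iteration. The following is a minimal Python sketch, not a definitive implementation: it assumes the logistic map in the form $f(x) = \mu x(1 - x)$, and the helper names (`logistic`, `attractor_period`, `delta_estimate`), the starting value, the transient length, and the tolerance are illustrative choices. The value $\mu_3 = 3.544090$ used in the last line is a commonly quoted numerical value and is an assumption here; the text only reads 3.54 off Fig. 19.2.

```python
import math


def logistic(x, mu):
    """One step of the logistic map f(x) = mu * x * (1 - x)."""
    return mu * x * (1.0 - x)


def attractor_period(mu, x0=0.3, transient=5000, max_period=64, tol=1e-8):
    """Return the smallest period p of the attracting cycle, or None.

    The first `transient` iterates are discarded; afterwards the orbit is
    compared with itself shifted by p steps.
    """
    x = x0
    for _ in range(transient):
        x = logistic(x, mu)
    orbit = [x]
    for _ in range(2 * max_period):
        orbit.append(logistic(orbit[-1], mu))
    for p in range(1, max_period + 1):
        if all(abs(orbit[i + p] - orbit[i]) < tol for i in range(max_period)):
            return p
    return None  # no cycle of period <= max_period detected


def delta_estimate(mu1, mu2, mu3):
    """Ratio of successive bifurcation spacings, cf. Eq. (19.4)."""
    return (mu2 - mu1) / (mu3 - mu2)


# Periods below mu_1 = 3, between mu_1 and mu_2, and between mu_2 and mu_3.
print(attractor_period(2.8), attractor_period(3.2), attractor_period(3.5))
# mu_3 = 3.544090 is an assumed numerical value (not taken from the text).
print(delta_estimate(3.0, 1.0 + math.sqrt(6.0), 3.544090))
```

With sharper values of the $\mu_n$ the first ratio comes out somewhat below the graphical estimate, and successive ratios then approach $\delta$ only gradually.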

We recognize that each successive period-doubling bifurcation is a smaller
replica of the sidewise cup shape, with twice the number of cups of the
bifurcation just before it. We can measure the width of successive cups, calling them
$d_n$, going along a definite branch. The ratios of widths of these self-similar cup
shapes can be shown to converge to another dimensionless quantity $\alpha$:

\[ \lim_{n\to\infty} \frac{d_n}{d_{n+1}} = \alpha = 2.5029\ldots \tag{19.5} \]

Using our earlier bifurcation points $x_4^* = 0.849938$ and $x_4^* = 0.43996$ and
reading off Fig. 19.2 approximate values for the next bifurcation values $x^*$
from the lower branch, we obtain

\[ d_1 = 0.849938 - 0.43996 = 0.409978 \approx 0.41, \quad d_2 = 0.51 - 0.35 = 0.16, \quad \frac{d_1}{d_2} = \frac{0.41}{0.16} = 2.56 \]

as a first approximation for $\alpha$. Further details are discussed in Section 19.3.

The Feigenbaum numbers $\delta$ and $\alpha$ are the same, and in this sense universal,
for the route to chaos via period doublings for all maps with a quadratic
maximum such as the logistic map. This is an example of order in chaos.
Experience shows that its validity is even wider, including two-dimensional
(dissipative) systems with twice continuously differentiable functions.$^1$ When
the maps behave like $|x - x_m|^{1+\varepsilon}$ near their maximum $x_m$ for some $\varepsilon$ between 0
and 1, the Feigenbaum number will depend on the exponent $\varepsilon$; thus, $\delta(\varepsilon)$ varies
between $\delta(1) = \delta$ given in Eq. (19.4) for quadratic maps and $\delta(0) = 2$ for $\varepsilon = 0$.$^2$
Solutions of (nonlinear) differential equations can often be analyzed in terms
of discrete maps (i.e., cycles of iterates, fixed points, and bifurcations) that are
generated by placing a transverse plane into a trajectory that then intersects
the plane in a series of points at increasing times.


EXERCISES 

19.2.1 Show that $x^* = 1$ is a nontrivial fixed point of the map
$x_{n+1} = x_n \exp[r(1 - x_n)]$ with a slope $1 - r$ so that the equilibrium is stable if
$0 < r < 2$.

19.2.2 Draw a bifurcation diagram for the exponential map of Exercise 19.2.1
for $r > 1.9$.

19.2.3 Determine fixed points of the cubic map $x_{n+1} = ax_n^3 + (1 - a)x_n$ for
$0 < a < 4$ and $0 < x_n < 1$.

19.2.4 Write the time-delayed logistic map $x_{n+1} = \mu x_n(1 - x_{n-1})$ as a
two-dimensional map, $x_{n+1} = \mu x_n(1 - y_n)$, $y_{n+1} = x_n$, and determine some
of its fixed points.

19.2.5 Show that the second bifurcation for the logistic map that leads to
cycles of period 4 is located at $\mu = 1 + \sqrt{6}$.

19.2.6 Construct a nonlinear iteration function with Feigenbaum $\delta$ in the
interval $2 < \delta < 4.6692\ldots$.

19.2.7 Determine the Feigenbaum $\delta$ for (a) the exponential map of Exercise
19.2.1, (b) some cubic map of Exercise 19.2.3, (c) the time-delayed
logistic map of Exercise 19.2.4.

19.2.8 Repeat Exercise 19.2.7 for Feigenbaum's $\alpha$ instead of $\delta$.

19.2.9 Find numerically the first four points $\mu$ for period doubling of the
logistic map and then obtain the first two approximations to the
Feigenbaum $\delta$. Compare with Fig. 19.2 and Eq. (19.4).


19.2.10 Find numerically the values $\mu$ where the cycle of period 1, 3, 4, 5, 6
begins and then where it becomes unstable.

Check values. For period 3, $\mu = 3.8284$,
                            4, $\mu = 3.9601$,
                            5, $\mu = 3.7382$,
                            6, $\mu = 3.6265$.

19.2.11 Repeat Exercise 19.2.9 for Feigenbaum's $\alpha$.


1 More details and computer codes for the logistic map are given by G. L. Baker and J. P. Gollub,
Chaotic Dynamics: An Introduction, 2nd ed. Cambridge Univ. Press, Cambridge, UK (1996).

2 For other maps and a discussion of the fascinating history of how chaos became again a hot
research topic, see D. Holton and R. M. May in The Nature of Chaos (T. Mullin, Ed.), Section 5,
p. 95. Clarendon, Oxford, UK (1993); and Gleick's Chaos (1987).


19.3 Sensitivity to Initial Conditions and Parameters 


Lyapunov Exponents 


In Section 19.2, we described how, as we approach the period doubling
accumulation parameter value $\mu_\infty = 3.5699\ldots$ from below, the period $n + 1$ of
cycles $(x_0, x_1, \ldots, x_n)$ with $x_{n+1} = x_0$ gets longer. The logistic map is defined
to be chaotic at all values of $\mu$ where fixed points with cycles of infinite length
occur. Thus, chaos starts with $\mu_\infty = 3.5699\ldots$. For most $\mu > \mu_\infty$ it is also
easy to check numerically that the distances between neighboring iterates $x_0$
and $x_0 + \varepsilon$,

\[ d_n = |f^{(n)}(x_0 + \varepsilon) - f^{(n)}(x_0)|, \tag{19.6} \]

grow as well for small $\varepsilon > 0$. With chaotic behavior this distance increases
exponentially with $n \to \infty$; that is, $d_n/\varepsilon = e^{\lambda n}$, or

\[ \lambda = \frac{1}{n} \ln \frac{|f^{(n)}(x_0 + \varepsilon) - f^{(n)}(x_0)|}{\varepsilon}, \tag{19.7} \]

where $\lambda$ is the Lyapunov exponent. For $\varepsilon \to 0$, we may rewrite Eq. (19.7) in
terms of derivatives as

\[ \lambda(x_0) = \frac{1}{n} \ln \left| \frac{df^{(n)}(x_0)}{dx} \right| = \frac{1}{n} \sum_{i=0}^{n-1} \ln |f'(x_i)|, \tag{19.8} \]

using the chain rule of differentiation for $df^{(n)}(x)/dx$, where

\[ \frac{df^{(2)}(x_0)}{dx} = \left.\frac{df}{dx}\right|_{x=f(x_0)} \left.\frac{df}{dx}\right|_{x=x_0} = f'(x_1) f'(x_0) \tag{19.9} \]

and $f' = df/dx$, $x_1 = f(x_0)$, etc. Our Lyapunov exponent has been calculated
at the point $x_0$. To understand whether or not an attractor $x^*$ is chaotic, one
needs to determine $\lambda_j$ for several starting points $x_j$ near $x^*$ and calculate
the average $\lambda$. If the average turns out to be positive, the attractor (and its
parameter $\mu$) is defined to be chaotic. This is particularly relevant in higher
dimensional dynamical systems in which the motion is often bounded so that
the $d_n$ cannot go to $\infty$. In general, one point is not enough to determine $\lambda$ as a
measure of the sensitivity of the system to changes in initial conditions. In such
cases, we repeat the procedure for several points on the trajectory and average
over them. This way, we obtain the average Lyapunov exponent for the
sample. This average value is often called and taken as the Lyapunov exponent.

The Lyapunov exponent $\lambda$ is a quantitative measure of chaos: A
one-dimensional iterated function such as the logistic map has chaotic iterates
$(x_0, x_1, \ldots)$ for the parameter $\mu$ if the average Lyapunov exponent is
positive for that value of $\mu$ (the shaded region in Fig. 19.2). For cycles of finite
period, $\lambda$ is negative. This is the case for $\mu < 3$, for $\mu < \mu_\infty$, and even in the
periodic window at $\mu \approx 3.627$ inside the chaotic region of Fig. 19.2. A negative
Lyapunov exponent does not necessarily correspond to periodicity of iterates.
At bifurcation points, $\lambda = 0$. For $\mu > \mu_\infty$, the Lyapunov exponent is positive,
except in the periodic windows where $\lambda < 0$, and $\lambda$ grows with $\mu$. In other
words, the logistic map becomes more chaotic as the control parameter $\mu$
increases.
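The sum in Eq. (19.8) is straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions rather than a reference implementation: it takes the logistic map as $f(x) = \mu x(1 - x)$ with $f'(x) = \mu(1 - 2x)$, discards a transient before averaging, and uses a single starting point; the function name `lyapunov_logistic` and all default parameter values are hypothetical choices.

```python
import math


def lyapunov_logistic(mu, x0=0.3, transient=1000, n=20000):
    """Estimate the Lyapunov exponent of f(x) = mu * x * (1 - x).

    Averages ln|f'(x_i)| with f'(x) = mu * (1 - 2x) over n iterates,
    following Eq. (19.8), after discarding `transient` initial steps.
    """
    x = x0
    for _ in range(transient):
        x = mu * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(mu * (1.0 - 2.0 * x)))
        x = mu * x * (1.0 - x)
    return total / n


for mu in (2.5, 3.2, 3.5, 4.0):
    print(mu, lyapunov_logistic(mu))
```

For a stable fixed point the average reduces to $\ln|f'(x^*)| = \ln|2 - \mu|$, which gives a quick consistency check at $\mu = 2.5$; at $\mu = 4$ the estimate should come out close to $\ln 2$.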

In the chaos region of the logistic map there is a scaling law for the average
Lyapunov exponent (that we do not derive here),

\[ \lambda(\mu) = \lambda_0 (\mu - \mu_\infty)^{\ln 2/\ln \delta}, \tag{19.10} \]

where $\ln 2/\ln \delta \approx 0.4498$, $\delta$ is the universal Feigenbaum number of Section 19.2,
and $\lambda_0$ is a constant. This relation is reminiscent of a physical observable at
a (second-order) phase transition. The exponent in Eq. (19.10) is a
universal number; the Lyapunov exponent plays the role of an order parameter,
whereas $\mu - \mu_\infty$ is the analog of $T - T_c$, where $T_c$ is the critical temperature
at which the phase transition occurs.


Fractals 


In dissipative chaotic systems (but rarely in conservative Hamiltonian 
systems), often new geometric objects with intricate shapes appear that are 
called fractals. Fractals are irregular geometric objects that exist at many 
scales like cauliflowers so that their smaller parts resemble their larger parts. 
Intuitively, a fractal is a set that is (approximately) self-similar under
magnification. The dimension is not a conventional positive integer. The clarification
and definition of the concept of dimension is our next topic. 

We need a quantitative measure of dimensionality in order to describe
fractals. Unfortunately, there are several definitions with usually different
numerical values, none of which has become a standard. For strictly
self-similar sets one measure suffices, of course. More complicated (e.g., only
approximately self-similar) sets require more measures for their complete
description. The simplest is the box-counting dimension of Kolmogorov
and Hausdorff. For a one-dimensional set, cover the curve by line segments
of length $R$; in two dimensions, cover the surface by boxes, squares of area
$R^2$; in three dimensions, cover the volume by cubes of volume $R^3$; etc. Count
the number $N(R)$ of boxes needed to cover the set. Letting $R$ go to zero, we
expect $N$ to scale as $N(R) \sim R^{-d}$. Taking the logarithm, the box-counting
dimension is defined as

\[ d = \lim_{R\to 0} [-\ln N(R)/\ln R]. \tag{19.11} \]

For example, in a two-dimensional space a single point is covered by one
square so that $\ln N(R) = 0$ and $d = 0$. A finite set of isolated points also has
dimension $d = 0$. For a differentiable curve of length $L$, $N(R) \sim L/R$ as $R \to 0$
so that $d = 1$ from Eq. (19.11), as expected.

Figure 19.3 Construction of the Koch Curve by Iterations

Let us now construct a more irregular set, the Koch curve. We start with
a line segment of unit length in Fig. 19.3 and remove the middle one-third.
Then we replace it with two segments of length one-third that form a triangle
in Fig. 19.3. We iterate this procedure with each segment ad infinitum. The
resulting Koch curve is infinitely long and is nowhere differentiable because
of the infinitely many discontinuous changes of slope. At the $n$th step each
line segment has length $R_n = 3^{-n}$ and there are $N(R_n) = 4^n$ segments. Hence,
its dimension is $d = \ln 4/\ln 3 = 1.26\ldots$, which is more than a curve but less
than a surface. As the Koch curve results from iteration of the first step, it is
strictly self-similar.
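The counting argument for the Koch curve can be reproduced in a few lines. This is a small sketch, assuming only the values $N(R_n) = 4^n$ and $R_n = 3^{-n}$ stated above; the function names are illustrative and not taken from the text.

```python
import math


def box_dimension_estimate(n_boxes, box_size):
    """Finite-R estimate -ln N(R) / ln R of the box-counting dimension,
    cf. Eq. (19.11)."""
    return -math.log(n_boxes) / math.log(box_size)


def koch_dimension(step):
    """Estimate for the Koch curve after `step` iterations:
    4**step segments, each of length 3**(-step)."""
    return box_dimension_estimate(4 ** step, 3.0 ** (-step))


def koch_length(step):
    """Total length (4/3)**step of the curve after `step` iterations."""
    return (4.0 / 3.0) ** step


for n in (1, 5, 10):
    print(n, koch_dimension(n), koch_length(n))
```

Because the curve is strictly self-similar, the estimate is independent of the step $n$, while the total length grows without bound.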

A two-dimensional Koch-type construction conserves area and starts from 
a square, adding in the middle of each side a narrow triangle and cutting one 
out of the square next to it, repeating the process over and over to make it
self-similar. Or starting from a rectangle, adding a small rectangle in the middle of
each side and cutting one out next to it, repeating the construction at smaller 
scales over and over. 

For the logistic map the box-counting dimension at a period-doubling
accumulation point $\mu_\infty$ is $0.5388\ldots$, which is a universal number for iterations
of functions in one variable with a quadratic maximum. To see how this comes
about, consider the pairs of line segments originating from successive
bifurcation points for a given parameter $\mu$ in the chaos regime (see Fig. 19.2).
Imagine removing the interior space from the chaotic bands. When we go
to the next bifurcation the relevant scale parameter is $\alpha = 2.5029\ldots$ from
Eq. (19.5). Suppose we need $2^n$ line segments of length $R$ to cover $2^n$ bands. In
the next stage, we need $2^{n+1}$ segments of length $R/\alpha$ to cover the bands. This
yields a dimension $d = -\ln(2^n/2^{n+1})/\ln \alpha = \ln 2/\ln \alpha = 0.7555\ldots$. This crude estimate
can be improved by taking into account that the width between neighboring
pairs of line segments differs by $1/\alpha$ (see Fig. 19.2). The improved estimate,
0.543, is closer to $0.5388\ldots$. A more accurate analysis of the logistic map
and other examples shows that when the fractal set does not have a strictly
self-similar structure, the box-counting dimension depends on the box
construction method.

A set of attracting points in the phase space of the (dissipative)
dynamics with noninteger dimension is called a strange attractor. Such strange
attractors play a pivotal role in the theory of chaos.

Finally, we turn to the beautiful fractals that are surprisingly easy to
generate and whose color pictures had considerable impact. For complex $c = a + ib$,
the quadratic complex map involving the complex variable $z = x + iy$,

\[ z_{n+1} = z_n^2 + c, \tag{19.12} \]

looks deceptively simple, but the equivalent two-dimensional map in terms of
the real variables

\[ x_{n+1} = x_n^2 - y_n^2 + a, \qquad y_{n+1} = 2x_n y_n + b \tag{19.13} \]

reveals more of its complexity. This map forms the basis for some of
Mandelbrot's beautiful multicolor fractal pictures (see Mandelbrot and
Peitgen in Additional Reading), and it has been found to generate intricate
shapes for various $c \neq 0$. For example, the Julia set of a map $z_{n+1} = F(z_n)$
is defined as the set of all its repellors or periodic points. Thus, it forms the
boundary between initial points of a two-dimensional iterated map leading to
iterates that diverge and those that stay within some finite region of the
complex plane. For the case $c = 0$ and $F(z) = z^2$, the Julia set can be shown to
be a circle about the origin of the complex plane. However, just by adding a
constant $c \neq 0$, the Julia set becomes fractal. For example, for $c = -1$ one
finds a fractal necklace with infinitely many loops (see Devaney in Additional
Reading).

While the Julia set is drawn in the complex plane, the Mandelbrot set is
constructed in the two-dimensional parameter space $c = (a, b) = a + bi$. It
is constructed as follows. Starting from the initial value $z_0 = 0 = (0, 0)$, one
searches Eq. (19.12) for parameter values $c$ so that the iterated $\{z_n\}$ do not
diverge to $\infty$. Each color outside the fractal boundary of the Mandelbrot set
represents a given number of iterations, $m$, needed for the $z_n$ to go beyond a
specified absolute (real) value $R$, $|z_m| > R > |z_{m-1}|$. For real parameter value
$c = a$, the resulting map $x_{n+1} = x_n^2 + a$ is equivalent to the logistic map with
period-doubling bifurcations (see Section 19.2) as $a$ decreases along the real axis
inside the Mandelbrot set.

Fractals are ubiquitous in nature, as seen in eroded steep mountain aretes,
coastlines, and cumulus cloud shapes because nonlinear iterations occur in
the dynamics shaping these formations.
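The escape-time construction of the Mandelbrot set described above translates almost directly into code. The following sketch is illustrative only: the function name `escape_time`, the escape radius $R = 2$, and the iteration cap are assumptions, not values given in the text. It iterates Eq. (19.12) from $z_0 = 0$ and returns the first $m$ with $|z_m| > R$.

```python
def escape_time(c, radius=2.0, max_iter=200):
    """Iterate z -> z*z + c from z_0 = 0, cf. Eq. (19.12).

    Returns the first m with |z_m| > radius, or None if the iterates stay
    within the radius for max_iter steps (c is then treated as a member
    of the Mandelbrot set for plotting purposes).
    """
    z = 0j
    for m in range(1, max_iter + 1):
        z = z * z + c
        if abs(z) > radius:
            return m
    return None


# A coarse character plot of the parameter plane c = a + ib.
for j in range(11):
    b = 1.2 - 0.24 * j
    row = ""
    for i in range(40):
        a = -2.1 + 0.07 * i
        row += "#" if escape_time(complex(a, b)) is None else "."
    print(row)
```

In a color picture the returned integer $m$ would select the color of the point $c$; here it is only used to separate bounded from escaping parameter values.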


EXERCISES 

19.3.1 Use a computer with BASIC or FORTRAN or symbolic software such
as Mathematica or Maple or a website to obtain the iterates $x_j$ of an
initial $0 < x_0 < 1$ and $f'(x_j)$ for the logistic map. Then calculate the
Lyapunov exponent for cycles of period 2, 3, ... of the logistic map for
$2 < \mu < 3.7$. Show that for $\mu < \mu_\infty$ the Lyapunov exponent $\lambda = 0$ at
bifurcation points and negative elsewhere, whereas for $\mu > \mu_\infty$ it is
positive except in periodic windows.

Hint. See Fig. 9.3 of Hilborn in Additional Reading.

19.3.2 Consider the map $x_{n+1} = F(x_n)$ with

\[ F(x) = \begin{cases} a + bx, & x < 1, \\ c + dx, & x > 1, \end{cases} \]

for $b > 0$ and $d < 0$. Show that its Lyapunov exponent is positive when
$b > 1$, $d < -1$. Plot a few iterations in the $(x_{n+1}, x_n)$ plane.


19.4 Nonlinear Differential Equations 


In Section 19.1, we mentioned nonlinear differential equations (NDEs) as
the natural place in physics for chaos to occur but continued with the
simpler iteration of nonlinear functions of one variable (maps). Here, we briefly
address the much broader area of NDEs and the far greater complexity in
the behavior of their solutions. However, maps and systems of solutions of
NDEs are closely related. The latter can often be analyzed in terms of
discrete maps. One prescription is the Poincaré section (or map) of a system of
NDE solutions. When a plane is placed transverse to a trajectory (of a solution of an
NDE), the trajectory intersects the plane in a series of points at increasing discrete times
[e.g., in Fig. 19.4, $(x(t_1), y(t_1)) = (x_1, y_1)$, $(x_2, y_2)$, ...], which are recorded and
graphically or numerically analyzed for fixed points, period-doubling
bifurcations, etc. This method is useful when solutions of NDEs are obtained
numerically in computer simulations so that one can generate Poincaré sections at
various locations and with different orientations with further analysis leading
to two-dimensional iterated maps

\[ x_{n+1} = F_1(x_n, y_n), \qquad y_{n+1} = F_2(x_n, y_n) \tag{19.14} \]

stored by the computer. Extracting the functions $F_j$ analytically or graphically
is not always easy, though.

Let us start with a few classical examples of NDEs.

Figure 19.4 Schematic of a Poincaré Section



Bernoulli and Riccati Equations 


Bernoulli equations are also nonlinear, having the form

\[ y'(x) = p(x)y(x) + q(x)[y(x)]^n, \tag{19.15} \]

where $p$ and $q$ are real functions and $n \neq 0, 1$ to exclude first-order linear ODEs.
They are an example of a nonlinear ODE that can be solved analytically, but
being one-dimensional, they do not generate chaos and their solutions do not
diverge exponentially. If we substitute

\[ u(x) = [y(x)]^{1-n}, \tag{19.16} \]

then Eq. (19.15) becomes a first-order linear ODE

\[ u' = (1 - n)y^{-n}y' = (1 - n)[p(x)u(x) + q(x)], \tag{19.17} \]

which we can solve as described in Section 8.2.
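The substitution of Eq. (19.16) can be checked against a direct numerical integration. The sketch below is a hedged illustration, not the text's own method: it assumes constant coefficients $p$ and $q$ with $n = 2$ (the defaults $p = 1$, $q = -2$ are an illustrative choice), for which Eq. (19.17) reads $u' = -(pu + q)$ with solution $u = -q/p + Ce^{-pt}$. The helper names `bernoulli_exact` and `rk4` are hypothetical, and the classical fourth-order Runge-Kutta integrator is a technique swapped in here for comparison.

```python
import math


def bernoulli_exact(t, y0, p=1.0, q=-2.0):
    """Solution of y' = p*y + q*y**2 via u = 1/y (Eq. 19.16 with n = 2).

    For constant p, q the linear ODE u' = -(p*u + q) has the solution
    u = -q/p + C*exp(-p*t) with C fixed by u(0) = 1/y0.
    """
    c = 1.0 / y0 + q / p
    u = -q / p + c * math.exp(-p * t)
    return 1.0 / u


def rk4(f, y0, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration of y' = f(t, y)."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y


def rhs(t, y):
    """Right-hand side of the illustrative Bernoulli NDE y' = y - 2*y**2."""
    return y - 2.0 * y * y


for t_end in (1.0, 5.0, 20.0):
    print(t_end, rk4(rhs, 1.0 / 3.0, t_end), bernoulli_exact(t_end, 1.0 / 3.0))
```

Both columns should agree to many digits and approach $-p/q = 1/2$ for large $t$, which illustrates that nearby solutions of this one-dimensional NDE do not separate exponentially.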


EXAMPLE 19.4.1 


Bernoulli NDE  Solve $\dot{y} = y - 2y^2$ for $y(t = 0) = 1/3$, an NDE that models
animal (and human) population growth similar to Verhulst's (or the logistic)
NDE. Here, the quadratic term with its negative sign prevents the
population from exploding. Neglecting it, one obtains exponential growth $y(t) \propto e^t$
(Malthus's law) with increasing time. Substituting $u = 1/y$, $\dot{u} = -\dot{y}/y^2$ into
the NDE $\dot{y}/y^2 = 1/y - 2$, we obtain $\dot{u} = 2 - u$. Separating variables and integrating
yields

\[ \int^u \frac{du}{u - 2} = -\int^t dt = -t + \ln C. \]

Exponentiating, this result gives $u = 2 + Ce^{-t}$, or $y(t) = 1/(2 + Ce^{-t})$. Upon varying
$C$ the solutions do not diverge exponentially, and there is no chaos. The initial
condition $y(0) = 1/3$ fixes $C = 1$. This solution $y(t) = 1/(2 + e^{-t})$ increases from
1/3 at $t = 0$ to 1/2 for $t \to \infty$. ■

Riccati equations$^3$ are quadratic in $y(x)$:

\[ y' = p(x)y^2 + q(x)y + r(x), \tag{19.18} \]

where $p \neq 0$ to exclude linear ODEs and $r \neq 0$ to exclude Bernoulli equations.
There is no general method for solving Riccati equations. However, when a
special solution $y_0(x)$ of Eq. (19.18) is known by guess or inspection, then the
substitution $y = y_0 + u$ leads to the Bernoulli equation for $u(x)$,

\[ u' = pu^2 + (2py_0 + q)u, \tag{19.19} \]

because substitution of $y = y_0 + u$ into Eq. (19.18) removes $r(x)$ from Eq.
(19.18).

Finally, we prove the following theorem: The substitution $y = -z'/[p z(x)]$
transforms the Riccati NDE into the homogeneous linear second-order ODE

\[ z'' - \left[ q + \frac{p'}{p} \right] z' + prz(x) = 0. \tag{19.20} \]

We show this by substituting $y'$ expressed in terms of $z$ into the NDE

\[ y' = -\frac{z''}{pz} + \frac{z'p'}{p^2 z} + \frac{z'^2}{pz^2} = \frac{z'^2}{pz^2} - \frac{qz'}{pz} + r, \]

which yields

\[ z'' = z' \left( q + \frac{p'}{p} \right) - prz. \]

If this ODE cannot be solved analytically, it may be solved by the power series
method.

EXAMPLE 19.4.2

Riccati NDE  Solve $y' = 1/2 - x^2 y + 2xy^2$ with initial condition $y(0) = 1$.

We start by guessing a solution (there are no rules to find one, just trial
and error), namely $y = x/2$, and verify it by $1/2 = 1/2 - x^3/2 + 2x(x^2/4)$.
Then we substitute $y = x/2 + u$, $y' = 1/2 + u'$ into the Riccati NDE to obtain the


Bernoulli NDE

\[ u' = x^2 u + 2xu^2. \]

We solve this by setting $v = 1/u$, $v' = -u'/u^2$. This gives the ODE

\[ -v' = x^2 v + 2x. \]

We solve first the homogeneous ODE $-v' = x^2 v$ by separating variables and
integrating

\[ \int \frac{dv}{v} = -\int^x x^2\,dx = -x^3/3 + \ln C. \]

Exponentiating gives $v = Ce^{-x^3/3}$. Now we vary the constant $C \to C(x)$ and
substitute $v = C(x)e^{-x^3/3}$ into the inhomogeneous ODE, getting $C' = -2xe^{x^3/3}$.
Hence, $v = e^{-x^3/3}\left(c - 2\int_0^x xe^{x^3/3}\,dx\right)$, $u = 1/v$, and finally $y = x/2 + u$. The
initial condition fixes the integration constant $c = 1$ from $y(0) = 1 = 1/c$. ■


3 Riccati's NDE has been used for solving the Schrödinger equation by S. B. Haley, Am. J. Phys.
65, 237 (1997).


Just as for Riccati equations, there are no general methods for
obtaining exact solutions of other nonlinear ODEs. It is more important
to develop methods for finding the qualitative behavior of solutions, for
example by numerical integration, and for determining where chaos occurs.
In Chapter 8, we described that power series solutions of ODEs exist except
(possibly) at essential singularities. The locations of the latter are directly
given by the properties of the coefficient functions of the ODE. Such a local
analysis provides us with the asymptotic behavior of solutions as well.


Fixed and Movable Singularities, Special Solutions 


Solutions of NDEs also have such singular points that are independent of the 
initial or boundary conditions and are called fixed singularities. In addition, 
they may have spontaneous or movable singularities that vary with the initial 
or boundary conditions. They complicate the (asymptotic) analysis of NDEs. 


EXAMPLE 19.4.3 


Movable Singularity  This point is illustrated by a comparison of the linear
ODE

\[ y' + \frac{y}{x - 1} = 0, \tag{19.21} \]

which has an obvious regular singularity at $x = 1$, with the NDE $y' = y^2$.
Both have the same solution with initial condition $y(0) = 1$, namely $y(x) =
1/(1 - x)$. For $y(0) = 2$, though, the pole in the (obvious, but check) solution
$y(x) = 2/(1 - 2x)$ of the NDE has moved to $x = 1/2$. More generally, $y(x) =
y_0/(1 - y_0 x)$ is a solution of $y' = y^2$ with $y(0) = y_0$, which in this case shows
how the singularity moves with the initial condition. ■
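The "obvious, but check" step above can also be done numerically. The short sketch below assumes only the solution $y(x) = y_0/(1 - y_0 x)$ quoted in the example; the function names and the finite-difference step size are illustrative assumptions. It verifies $y' = y^2$ with a central difference and reports the pole position $x = 1/y_0$.

```python
def solution(x, y0):
    """Closed-form solution y(x) = y0 / (1 - y0*x) of y' = y**2, y(0) = y0."""
    return y0 / (1.0 - y0 * x)


def residual(x, y0, h=1e-6):
    """Central-difference estimate of y'(x) minus y(x)**2 (should be ~0)."""
    dy = (solution(x + h, y0) - solution(x - h, y0)) / (2.0 * h)
    return dy - solution(x, y0) ** 2


def pole_position(y0):
    """Location x = 1/y0 of the movable pole for initial value y0 != 0."""
    return 1.0 / y0


for y0 in (1.0, 2.0, 4.0):
    print(y0, pole_position(y0), residual(0.1, y0))
```

The pole position changes with $y_0$, whereas the singularity of the linear ODE (19.21) stays at $x = 1$ for every initial value.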


For a linear second-order ODE we have a complete description of (the
asymptotic behavior of) its solutions when (that of) two linearly independent
solutions are known. For NDEs there may still be special solutions whose
asymptotic behavior is not obtainable from two independent solutions. This
is another characteristic property of NDEs, which we illustrate again by an
example.


EXAMPLE 19.4.4 


Special Solution  The class of solutions of the NDE $y'' = yy'/x$ that depends
on two parameters to satisfy initial conditions $y(0) = y_0$ and $y'(0) = y_0'$, for
example, is given by

\[ y(x) = 2c_1 \tan(c_1 \ln x + c_2) - 1, \tag{19.22} \]

where $c_i$ are integration constants. An obvious (check it) special solution is
$y = c_3 =$ constant, which cannot be obtained from Eq. (19.22) for any choice
of the parameters $c_1$, $c_2$ for $c_3 \neq -1$. Note that using the substitution $x = e^t$,
$Y(t) = y(e^t)$ so that $x\,dy/dx = dY/dt$, we obtain the ODE $Y'' = Y'(Y + 1)$. This
ODE can be integrated once to give $Y' = \frac{1}{2}Y^2 + Y + c$, with $c = 2(c_1^2 + 1/4)$ an
integration constant, and again according to Section 8.2 to lead to the solution
of Eq. (19.22). ■

Autonomous Differential Equations 

Differential equations that do not explicitly contain the independent
variable taken to be the time $t$ here are called autonomous. Verhulst's NDE
$\dot{y} = dy/dt = \mu y(1 - y)$, which we discussed briefly in Section 19.2 as
motivation for the logistic map, is a special case of this wide and important class
of ODEs.$^4$ For one dependent variable $y(t)$, they can be written as

\[ \dot{y} = f(y) \tag{19.23a} \]

and for several dependent variables as a system

\[ \dot{y}_i = f_i(y_1, y_2, \ldots, y_n), \qquad i = 1, 2, \ldots, n \tag{19.23b} \]

with sufficiently differentiable functions $f$, $f_i$. A solution of Eq. (19.23) is a
curve or trajectory $y(t)$ for $n = 1$ and, for $n > 1$, a trajectory $(y_1(t), y_2(t), \ldots,
y_n(t))$ in phase space. As discussed in Section 19.1, two trajectories cannot
cross because of the uniqueness of the solutions of ODEs. Clearly, solutions
of the algebraic system

\[ f_i(y_1, y_2, \ldots, y_n) = 0, \qquad i = 1, 2, \ldots, n \tag{19.24} \]

are special points in phase space, where the position vector $(y_1, y_2, \ldots, y_n)$
does not move on the trajectory; they are called critical (or fixed) points, just
like fixed points for nonlinear maps. A local analysis of solutions near critical
points leads to an understanding of the global behavior of the solutions. First,
let us discuss a simple example.


4 Solutions of nonautonomous equations can be more complicated.

EXAMPLE 19.4.5

Verhulst's NDE  For Verhulst's ODE $\dot{y} = f(y) = \mu y(1 - y)$, the condition $f(y) = 0$ with $\mu > 0$
gives $y = 0$ and $y = 1$ as the critical points; for the logistic map $y = 0$
is a repellor as $df/dy = \mu > 1$, whereas $x^* = 1 - 1/\mu$ is an attractor as
$|df(x^*)/dy| = |2 - \mu| < 1$ for $\mu < 3$.

At $y = 0$ we expand $\dot{y} = f(0) + f'(0)y + \cdots = \mu y + \cdots$ so that $y$ moves
to the right for $y > 0$ and to the left for $y < 0$. This behavior
characterizes $y = 0$ as a repellor. At $y = 1$, we expand

\[ \dot{y} = f(1) + f'(1)(y - 1) + \cdots = -\mu(y - 1) + \cdots \]

so that $y$ moves to the right for $y < 1$ and to the left for $y > 1$. This behavior
describes an attractor.

This local analysis near $y = 0$ suggests neglecting the $y^2$ term and solving
$\dot{y} = \mu y$ instead. Integrating $\int dy/y = \mu t + \ln c$ gives the solution $y(t) = ce^{\mu t}$,
which diverges as $t \to \infty$ so that $y = 0$ is a repellor. Similarly, at $y = 1$,
$\int dy/(1 - y) = \mu t - \ln c$ leads to $y(t) = 1 - ce^{-\mu t} \to 1$ for $t \to \infty$. Hence, $y = 1$
is an attractor. Because the NDE is separable, its general solution is given by

\[ \mu t + \ln c = \int \frac{dy}{y(1 - y)} = \int dy \left[ \frac{1}{y} + \frac{1}{1 - y} \right] = \ln \frac{y}{1 - y}. \]

Hence, $y(t) = ce^{\mu t}/[1 + ce^{\mu t}]$ for $t \to \infty$ converges to unity, thus confirming the
local analysis. No chaos is found here because the NDE has only one dynamical
variable, and as $c$ is varied there is no exponential divergence. ■

This example motivates us to examine the properties of fixed points in
more detail. In general, it is easy to see that

• in one dimension fixed points $y_i$ with $f(y_i) = 0$ divide the $y$-axis into
dynamically separate intervals because, given an initial value in one of the
intervals, the trajectory $y(t)$ will stay there since it cannot go beyond either
fixed point where $\dot{y} = 0$.

If $f'(y_0) > 0$ at the fixed point $y_0$ where $f(y_0) = 0$, then at $y_0 + \varepsilon$ for $\varepsilon > 0$
sufficiently small, $\dot{y} = f'(y_0)\varepsilon + O(\varepsilon^2) > 0$ in a neighborhood to the right of $y_0$
so that the trajectory $y(t)$ keeps moving to the right, away from the fixed point
$y_0$. To the left of $y_0$, $\dot{y} = -f'(y_0)\varepsilon + O(\varepsilon^2) < 0$ so that the trajectory moves
away from the fixed point here as well. Hence,

• a fixed point [with $f(y_0) = 0$] at $y_0$ with $f'(y_0) > 0$ as shown in Fig. 19.5a
repels trajectories (i.e., all trajectories move away from the critical point):

It is a repellor. Similarly, we see that

• a fixed point at $y_0$ with $f'(y_0) < 0$ as shown in Fig. 19.5b attracts trajectories
(i.e., all trajectories converge toward the critical point $y_0$):

It is a sink.

Figure 19.5 Fixed Points. (a) Repellor; (b) Sink



Let us now consider the remaining case in which also $f'(y_0) = 0$. Let us assume
$f''(y_0) > 0$. Then at $y_0 + \varepsilon$ to the right of fixed point $y_0$,

\[ \dot{y} = f''(y_0)\varepsilon^2/2 + O(\varepsilon^3) > 0 \]

so that the trajectory moves away from the fixed point there, whereas to the
left it moves closer to $y_0$. This is defined as a saddle point in one dimension.
For $f''(y_0) < 0$, the sign of $\dot{y}$ is reversed so that we deal again with a saddle
point with the motion to the right of $y_0$ toward the fixed point and at left away
from it. Let us summarize the local behavior of trajectories near such a fixed
point $y_0$: We have

• a saddle point at $y_0$ when $f(y_0) = 0$, and $f'(y_0) = 0$, as shown in Figs.
19.6a and 19.6b corresponding to the cases in which (a) $f''(y_0) > 0$ and
trajectories on one side of the critical point converge toward it and diverge
from it on the other side ($\cdots \to \bullet \to \cdots$) and (b) $f''(y_0) < 0$. Here, the
direction is simply reversed compared to (a). Fig. 19.6(c) shows the cases
in which $f''(y_0) = 0$.
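The one-dimensional classification summarized above depends only on the signs of $f'(y_0)$ and $f''(y_0)$, so it can be sketched with finite differences. This is an illustrative sketch, not a robust tool: the function name `classify_fixed_point`, the step size `h`, and the tolerance `tol` are assumptions, and the derivative estimates are only as good as the central differences used.

```python
def classify_fixed_point(f, y0, h=1e-4, tol=1e-6):
    """Classify a fixed point y0 of dy/dt = f(y) from f'(y0) and f''(y0).

    Derivatives are estimated by central differences with step h.
    """
    if abs(f(y0)) > tol:
        raise ValueError("y0 is not a fixed point: f(y0) != 0")
    slope = (f(y0 + h) - f(y0 - h)) / (2.0 * h)
    if slope > tol:
        return "repellor"
    if slope < -tol:
        return "sink"
    curvature = (f(y0 + h) - 2.0 * f(y0) + f(y0 - h)) / (h * h)
    if abs(curvature) > tol:
        return "saddle"
    return "degenerate"  # f'(y0) = f''(y0) = 0, cf. Fig. 19.6(c)


def verhulst(y, mu=2.0):
    """Verhulst right-hand side f(y) = mu*y*(1 - y); mu = 2 is illustrative."""
    return mu * y * (1.0 - y)


print(classify_fixed_point(verhulst, 0.0))
print(classify_fixed_point(verhulst, 1.0))
print(classify_fixed_point(lambda y: y * y, 0.0))
```

For Verhulst's NDE the two calls reproduce the repellor at $y = 0$ and the attractor at $y = 1$, while $f(y) = y^2$ provides a simple case with $f'(0) = 0$ and $f''(0) > 0$.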

So far, we have ignored the additional dependence of $f(y)$ on one or more
parameters, such as $\mu$ for the logistic map. When a critical point maintains its
properties qualitatively as we adjust a parameter slightly, we call it structurally
stable. This definition makes sense in practice: Structurally unstable objects
are unlikely to occur in reality because noise and other neglected degrees of
freedom act as perturbations on the system that effectively prevent unstable
points from being observed. Let us now examine fixed points from this point of
view. Upon varying such a control parameter slightly, we deform the function
$f$, or we may just shift $f$ up or down or sideways in Fig. 19.5. This will slightly
move the location $y_0$ of the fixed point with $f(y_0) = 0$ but maintain the sign
of $f'(y_0)$. Thus, both attractors and repellors are structurally stable,
whereas a saddle point in general is not. For example, shifting $f$ in Fig. 19.6a
down creates two fixed points, one a sink and the other a repellor, and removes
the saddle point. However, saddle points mark the border between different
types of dynamics and are useful and meaningful for the global analysis of the
dynamics. We are now ready to consider the richer, but more complicated,
higher dimensional cases.

Figure 19.6 Saddle Points


Local and Global Behavior in Higher Dimensions 

In two or more dimensions we start the local analysis at a fixed point

\[ (y_1^0, y_2^0, \ldots, y_n^0) \quad \text{with} \quad \dot{y}_i = f_i(y_1^0, y_2^0, \ldots, y_n^0) = 0, \]

using the same Taylor expansion of the $f_i$ in Eq. (19.23b) as for the
one-dimensional case. Retaining only the first-order derivatives, this approach
linearizes the coupled NDEs of Eq. (19.23b) and reduces their solution to linear
algebra as follows. We abbreviate the constant derivatives at the fixed point
as a matrix F with elements

\[ f_{ij} = \left. \frac{\partial f_i}{\partial y_j} \right|_{(y_1^0, y_2^0, \ldots, y_n^0)}. \tag{19.25} \]
In contrast to the standard linear algebra in Chapter 3, however, F is neither 
symmetric nor Hermitian in general. If we shift the fixed point to the origin 
and call the shifted coordinates x- L — y-, — >/■, then the coupled NDEs of 
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EXAMPLE 19.4.6 


Eq. (19.23b) become 

Xi = fijXj, (19.26) 

that is, coupled linear ODEs with constant coefficients. Here, a summation over 
the repeated index j is understood. This expansion is the linearized dynamics 
around one fixed point in the multidimensional space. We solve Eq. (19.26) 
with the standard exponential Ansatz (Sections 8.3 and 15.9) 

Xi(t) = Cije Xjt , (19.27) 

with constant exponents Xj and a constant matrix C of coefficients cy so 
that c j — (Cij , i = 1, 2,...) forms the jth column vector of C. Substitut¬ 
ing Eq. (19.27) into Eq. (19.26) yields a linear combination of exponential 
functions, 

djXje^ 1 = f ik c kj e Xjt , (19.28) 

which are independent if kj ^ kj. This is the general case on which we focus. 
Degeneracies in which two or more X are equal require special treatment, 
as in the case of saddle points in one dimension. Comparing coefficients of 
exponential functions with the same exponent yields the linear eigenvalue 
equations 

fi k c k j — XjCij or F Cy, = XjCj, (19.29) 

where the repeated index k on the left-hand side is summed, but j on the right- 
hand side is held fixed. A nontrivial solution comprising the eigenvalue Xj and 
eigenvector c j of the homogeneous linear equations (19.29) requires Xj to be 
a root of the secular polynomial (compare with Section 3.5) 

det(F — A 1) = 0. (19.30) 

Equation (19.29) states that C diagonalizes F so that we can write Eq. (19.29) 
as 

C- J FC = [X h X 2 ,...], (19.31) 

a diagonal matrix with diagonal matrix elements it, in the notation of Chapter 3. 
In the new, but generally nonorthogonal, coordinates defined as C£ = x we 
have a characteristic exponent for each direction iy, as f y = Xjtjj, where 
the Xj play the role of f'(yu) in the one-dimensional case. The X are complex 
numbers in general. (There are no other directions in the solutions.) This is 
seen by substituting x = C£ into Eq. (19.26) in conjunction with Eqs. (19.29) 
and (19.31). Thus, this solution represents the independent combination of one¬ 
dimensional fixed points, one for each component of £ and each independent of 
the other components. In two dimensions for X\ < 0 and A. 2 < 0, we have a sink. 

Stable Sink The coupled ODEs

$$\dot{x} = -x, \qquad \dot{y} = -x - 3y$$

have an equilibrium point at the origin. The solutions have the form

$$x(t) = c_{11} e^{\lambda_1 t}, \qquad y(t) = c_{21} e^{\lambda_1 t} + c_{22} e^{\lambda_2 t},$$

so that the eigenvalue $\lambda_1 = -1$ results from $\lambda_1 c_{11} = -c_{11}$, and the solution is
$x = c_{11} e^{-t}$. The determinant of Eq. (19.30),

$$\begin{vmatrix} -1-\lambda & 0 \\ -1 & -3-\lambda \end{vmatrix} = (1 + \lambda)(3 + \lambda) = 0,$$

yields the eigenvalues $\lambda_1 = -1$, $\lambda_2 = -3$. Because both are negative, we have
a stable sink at the origin. The ODE for y gives the linear relations

$$\lambda_1 c_{21} = -c_{11} - 3c_{21} = -c_{21}, \qquad \lambda_2 c_{22} = -3c_{22},$$

from which we infer $2c_{21} = -c_{11}$ or $c_{21} = -c_{11}/2$. Because the general solution
will contain two constants, it is given by

$$x(t) = c_{11} e^{-t}, \qquad y(t) = -\frac{c_{11}}{2} e^{-t} + c_{22} e^{-3t}.$$

As the time $t \to \infty$, we have $y \sim -x/2$ and $x \to 0$ and $y \to 0$, whereas for
$t \to -\infty$, $y \sim x^3$ and $x, y \to \pm\infty$. The motion toward the sink is indicated by
arrows in Fig. 19.7. To find the orbit we eliminate the independent variable t
and find the cubics

$$y = -\frac{x}{2} + \frac{c_{22}}{c_{11}^3}\, x^3.$$
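As a quick numerical cross-check of the elimination of t, the following sketch evaluates the closed-form solution of the stable-sink example at several times and compares y(t) with the cubic orbit. The function names and the sample constants are illustrative choices, not part of the text.

```python
import math

def sink_solution(t, c11, c22):
    """Closed-form solution of x' = -x, y' = -x - 3y from the example."""
    x = c11 * math.exp(-t)
    y = -0.5 * c11 * math.exp(-t) + c22 * math.exp(-3.0 * t)
    return x, y

def sink_orbit(x, c11, c22):
    """Orbit after eliminating t: y = -x/2 + (c22 / c11**3) * x**3."""
    return -0.5 * x + (c22 / c11**3) * x**3

# Hypothetical constants chosen only for illustration.
c11, c22 = 1.5, -0.7

# Largest mismatch between y(t) and the orbit formula over -1 <= t <= 3.
max_gap = 0.0
for k in range(-10, 31):
    t = 0.1 * k
    x, y = sink_solution(t, c11, c22)
    max_gap = max(max_gap, abs(y - sink_orbit(x, c11, c22)))

print(max_gap)
```

The mismatch should stay at the level of floating-point rounding, since $x^3 = c_{11}^3 e^{-3t}$ reproduces the second exponential exactly.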


Figure 19.7
Stable Sink

When both $\lambda$ are > 0, we have a repellor. In this case, the motion is away from the
fixed point. However, when the $\lambda$ have different signs we have a saddle point
(i.e., a combination of a sink in one dimension and a repellor in the other).

EXAMPLE 19.4.7

Saddle Point The coupled ODEs

$$\dot{x} = -2x - y, \qquad \dot{y} = -x + 2y$$

have a fixed point at the origin. The solutions have the form

$$x(t) = c_{11} e^{\lambda_1 t} + c_{12} e^{\lambda_2 t}, \qquad y(t) = c_{21} e^{\lambda_1 t} + c_{22} e^{\lambda_2 t}.$$

The eigenvalues $\lambda = \pm\sqrt{5}$ are determined from

$$\begin{vmatrix} -2-\lambda & -1 \\ -1 & 2-\lambda \end{vmatrix} = \lambda^2 - 5 = 0.$$


Substituting the general solutions into the ODEs yields the linear equations

$$\lambda_1 c_{11} = -2c_{11} - c_{21} = \sqrt{5}\, c_{11}, \qquad \lambda_1 c_{21} = -c_{11} + 2c_{21} = \sqrt{5}\, c_{21},$$

$$\lambda_2 c_{12} = -2c_{12} - c_{22} = -\sqrt{5}\, c_{12}, \qquad \lambda_2 c_{22} = -c_{12} + 2c_{22} = -\sqrt{5}\, c_{22},$$

or

$$(\sqrt{5} + 2)c_{11} = -c_{21}, \qquad (\sqrt{5} - 2)c_{12} = c_{22},$$

$$(\sqrt{5} - 2)c_{21} = -c_{11}, \qquad (\sqrt{5} + 2)c_{22} = c_{12},$$

so that $c_{21} = -(2 + \sqrt{5})c_{11}$, $c_{22} = (\sqrt{5} - 2)c_{12}$. The family of solutions depends
on two parameters $c_{11}$, $c_{12}$. For large time $t \to \infty$, the positive exponent
prevails and $y \sim -(\sqrt{5} + 2)x$, whereas for $t \to -\infty$ we have $y \sim (\sqrt{5} - 2)x$.
These straight lines are the asymptotes of the orbits. Because
$-(\sqrt{5} + 2)(\sqrt{5} - 2) = -1$, they are orthogonal. We find the orbits by eliminating the
independent variable t as follows. Substituting the $c_{2j}$, we write

$$y = -2x - \sqrt{5}\,(c_{11} e^{\sqrt{5} t} - c_{12} e^{-\sqrt{5} t}) \quad\text{so that}\quad \frac{y + 2x}{\sqrt{5}} = -c_{11} e^{\sqrt{5} t} + c_{12} e^{-\sqrt{5} t}.$$

Now we add and subtract the solution x(t) to get

$$\frac{1}{\sqrt{5}}(y + 2x) + x = 2c_{12} e^{-\sqrt{5} t}, \qquad \frac{1}{\sqrt{5}}(y + 2x) - x = -2c_{11} e^{\sqrt{5} t},$$

which we multiply to obtain

$$\frac{1}{5}(y + 2x)^2 - x^2 = -4c_{12}c_{11} = \text{const.}$$

The resulting quadratic form $y^2 + 4xy - x^2 = \text{const.}$ is a rotated hyperbola
because of the negative sign and the mixed term. The hyperbola is rotated
because its asymptotes are not aligned with the x, y-axes (Fig. 19.8).
Its orientation is given by the direction of the asymptotes that we found
earlier. Alternatively, we could find the direction of minimal distance from the
origin, proceeding as in Examples 1.5.3 and 1.5.4, that is, by setting the x and y
derivatives of $f + \Lambda g = x^2 + y^2 + \Lambda(y^2 + 4xy - x^2)$ equal to zero, where
$\Lambda$ is the Lagrange multiplier for the hyperbolic constraint. The four branches
of hyperbolas correspond to the different signs of the parameters $c_{11}$ and $c_{12}$.
Figure 19.8 is plotted for the cases $c_{11} = \pm 1$, $c_{12} = \pm 2$. ■
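The conserved quadratic form of the saddle example can be checked numerically. The sketch below evaluates the closed-form solution with $c_{21} = -(2+\sqrt{5})c_{11}$ and $c_{22} = (\sqrt{5}-2)c_{12}$ and confirms that $y^2 + 4xy - x^2$ stays at $-20\,c_{11}c_{12}$, which is five times the constant $-4c_{12}c_{11}$ obtained in the example. The function names and the sampled time grid are illustrative assumptions.

```python
import math

SQRT5 = math.sqrt(5.0)

def saddle_solution(t, c11, c12):
    """Closed-form solution of x' = -2x - y, y' = -x + 2y."""
    c21 = -(2.0 + SQRT5) * c11   # from lambda_1 = +sqrt(5)
    c22 = (SQRT5 - 2.0) * c12    # from lambda_2 = -sqrt(5)
    x = c11 * math.exp(SQRT5 * t) + c12 * math.exp(-SQRT5 * t)
    y = c21 * math.exp(SQRT5 * t) + c22 * math.exp(-SQRT5 * t)
    return x, y

def quadratic_form(x, y):
    """The orbit invariant y^2 + 4xy - x^2."""
    return y * y + 4.0 * x * y - x * x

c11, c12 = 1.0, 2.0              # one of the cases plotted in Fig. 19.8
expected = -20.0 * c11 * c12     # 5 * (-4 c12 c11)
values = [quadratic_form(*saddle_solution(0.1 * k, c11, c12))
          for k in range(-10, 11)]

print(min(values), max(values), expected)
```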

This type of behavior generalizes to higher dimensions. 

However, a new kind of behavior arises for a pair of complex conjugate
eigenvalues $\lambda_{1,2} = \rho \pm i\kappa$. If we write the complex solutions
$\xi_{1,2} = \exp(\rho t \pm i\kappa t)$ in real variables $\xi_+ = (\xi_1 + \xi_2)/2$,
$\xi_- = (\xi_1 - \xi_2)/(2i)$ upon using the Euler identity $\exp(ix) = \cos x + i \sin x$
(see Sections 3.5 and 6.1),

$$\xi_+ = \exp(\rho t)\cos(\kappa t), \qquad \xi_- = \exp(\rho t)\sin(\kappa t) \tag{19.32}$$

describe a trajectory that spirals inward to the fixed point at the origin for
$\rho < 0$, a spiral sink, and spirals away from the fixed point for $\rho > 0$, a spiral
repellor.

EXAMPLE 19.4.8

Spiral Fixed Point The coupled ODEs

$$\dot{x} = -x + 3y, \qquad \dot{y} = -3x + 2y$$

have a fixed point at the origin and solutions of the form

$$x(t) = c_{11} e^{\lambda_1 t} + c_{12} e^{\lambda_2 t}, \qquad y(t) = c_{21} e^{\lambda_1 t} + c_{22} e^{\lambda_2 t}.$$

The exponents $\lambda_{1,2}$ are solutions of

$$\begin{vmatrix} -1-\lambda & 3 \\ -3 & 2-\lambda \end{vmatrix} = (1 + \lambda)(\lambda - 2) + 9 = 0,$$

or $\lambda^2 - \lambda + 7 = 0$. The eigenvalues are complex conjugate, $\lambda = 1/2 \pm i\sqrt{27}/2$,
so that we deal with a spiral fixed point at the origin (a repellor because
1/2 > 0). Substituting the general solutions into the ODEs yields the linear
equations

$$\lambda_1 c_{11} = -c_{11} + 3c_{21}, \qquad \lambda_1 c_{21} = -3c_{11} + 2c_{21},$$

$$\lambda_2 c_{12} = -c_{12} + 3c_{22}, \qquad \lambda_2 c_{22} = -3c_{12} + 2c_{22},$$

or

$$(\lambda_1 + 1)c_{11} = 3c_{21}, \qquad (\lambda_1 - 2)c_{21} = -3c_{11},$$

$$(\lambda_2 + 1)c_{12} = 3c_{22}, \qquad (\lambda_2 - 2)c_{22} = -3c_{12},$$

which, using the values of $\lambda_{1,2}$, imply the family of curves

$$x(t) = e^{t/2}\left(c_{11} e^{i\sqrt{27}\,t/2} + c_{12} e^{-i\sqrt{27}\,t/2}\right),$$

$$y(t) = \frac{x}{2} + \frac{i\sqrt{27}}{6}\, e^{t/2}\left(c_{11} e^{i\sqrt{27}\,t/2} - c_{12} e^{-i\sqrt{27}\,t/2}\right)$$

that depend on two parameters $c_{11}$, $c_{12}$. To simplify, we can separate real and
imaginary parts of x(t) and y(t) using the Euler identity $e^{ix} = \cos x + i \sin x$.
It is equivalent, but more convenient, to choose $c_{11} = c_{12} = c/2$ and rescale
$t \to 2t$ so that with the Euler identity we have

$$x(t) = c e^{t} \cos(\sqrt{27}\,t), \qquad y(t) = \frac{x}{2} - \frac{\sqrt{27}}{6}\, c e^{t} \sin(\sqrt{27}\,t).$$

Here, we can eliminate the trigonometric functions and find the orbit

$$x^2 + \frac{4}{3}\left(y - \frac{x}{2}\right)^2 = (c e^{t})^2.$$

For fixed t this is the positive definite quadratic form $x^2 - xy + y^2 = \text{const.}$ (i.e.,
an ellipse). However, there is no ellipse in the solutions because t is not fixed.
Nonetheless, it is useful to find its orientation. We proceed as in Examples 1.5.3
and 1.5.4. With $\Lambda$ the Lagrange multiplier for the elliptical constraint, we seek
the directions of maximal and minimal distance from the origin, forming

$$f(x, y) + \Lambda g(x, y) = x^2 + y^2 + \Lambda(x^2 - xy + y^2)$$

and setting

$$\frac{\partial(f + \Lambda g)}{\partial x} = 2x + 2\Lambda x - \Lambda y = 0, \qquad \frac{\partial(f + \Lambda g)}{\partial y} = 2y + 2\Lambda y - \Lambda x = 0.$$

From

$$2(\Lambda + 1)x = \Lambda y, \qquad 2(\Lambda + 1)y = \Lambda x$$

we obtain the directions

$$\frac{x}{y} = \frac{\Lambda}{2(\Lambda + 1)} = \frac{2(\Lambda + 1)}{\Lambda},$$

or $\Lambda^2 + \frac{8}{3}\Lambda + \frac{4}{3} = 0$. This yields the values $\Lambda = -2/3, -2$ and the directions
$y = \pm x$. In other words, our ellipse is centered at the origin and rotated by
45°. As we vary the independent variable t, the size of the ellipse changes so
that we get the rotated spiral shown in Fig. 19.9 for c = 1. ■

Figure 19.9
Spiral Point
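Two steps of the spiral example lend themselves to a short numerical check: the relation $x^2 + \frac{4}{3}(y - x/2)^2 = (ce^t)^2$ satisfied by the rescaled real solution, and the Lagrange-multiplier roots that give the directions $y = \pm x$. The following is a minimal sketch; the helper names are illustrative, and the quadratic $3\Lambda^2 + 8\Lambda + 4 = 0$ is the multiplier equation of the example cleared of fractions.

```python
import math

S27 = math.sqrt(27.0)

def spiral_solution(t, c=1.0):
    """Rescaled real solution x = c e^t cos(sqrt(27) t), y = x/2 - ... sin(...)."""
    x = c * math.exp(t) * math.cos(S27 * t)
    y = 0.5 * x - (S27 / 6.0) * c * math.exp(t) * math.sin(S27 * t)
    return x, y

def scaled_form(x, y):
    """x^2 + (4/3)(y - x/2)^2, which equals (4/3)(x^2 - x y + y^2)."""
    return x * x + (4.0 / 3.0) * (y - 0.5 * x) ** 2

# Relative mismatch between the quadratic form and (c e^t)^2 for c = 1.
rel_errors = []
for k in range(0, 21):
    t = 0.1 * k
    x, y = spiral_solution(t)
    rel_errors.append(abs(scaled_form(x, y) - math.exp(2.0 * t)) / math.exp(2.0 * t))

# Roots of 3 L^2 + 8 L + 4 = 0 and the slopes y/x = 2(L + 1)/L they imply.
disc = math.sqrt(8.0**2 - 4.0 * 3.0 * 4.0)
multipliers = sorted([(-8.0 - disc) / 6.0, (-8.0 + disc) / 6.0])
slopes = [2.0 * (L + 1.0) / L for L in multipliers]

print(max(rel_errors), multipliers, slopes)
```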


In the special case in which $\rho = 0$, the circular trajectory is called a cycle.

EXAMPLE 19.4.9

Center or Cycle The undamped linear harmonic oscillator ODE $\ddot{x} + \omega^2 x = 0$
can be written as two coupled ODEs:

$$\dot{x} = -\omega y, \qquad \dot{y} = \omega x.$$

Integrating the resulting ODE $x\dot{x} + y\dot{y} = 0$ yields the circular orbits
$x^2 + y^2 = \text{const.}$, which define a center at the origin and are shown in Fig. 19.10.
The solutions can be parameterized as $x = R\cos\omega t$, $y = R\sin\omega t$, where R is
the radius parameter. They correspond to the complex conjugate eigenvalues
$\lambda_{1,2} = \pm i\omega$. We can check them if we write the general solution as

$$x(t) = c_{11} e^{\lambda_1 t} + c_{12} e^{\lambda_2 t}, \qquad y(t) = c_{21} e^{\lambda_1 t} + c_{22} e^{\lambda_2 t}.$$

Then the eigenvalues follow from

$$\begin{vmatrix} -\lambda & -\omega \\ \omega & -\lambda \end{vmatrix} = \lambda^2 + \omega^2 = 0. \qquad ■$$

Figure 19.10
Center


For the driven damped pendulum, when trajectories near a center (closed
orbit) are attracted to it as time goes on, this closed orbit is defined as a limit
cycle representing periodic motion for autonomous systems. A damped
pendulum usually spirals into the origin (the position at rest); that is, the origin
is a spiral fixed point in its phase space. When we turn on a driving force, the
system formally becomes nonautonomous because of its explicit time
dependence, but it also becomes more interesting. In this case, we can call the
explicit time in a sinusoidal driving force a new variable $\varphi$, where $\omega_0$ is a fixed
rate, in the equation of motion

$$\dot{\omega} + \gamma\omega + \sin\theta = f \sin\varphi, \qquad \omega = \dot{\theta}, \qquad \varphi = \omega_0 t.$$

Then we increase the dimension of our phase space by one (adding one
variable $\varphi$) because $\dot{\varphi} = \omega_0 = \text{const.}$, but we keep the coupled ODEs autonomous.
This driven damped pendulum has trajectories that cross a closed orbit in
phase space and spiral back to it: It is called a limit cycle. This happens for a
range of strength f of the driving force, the control parameter of the system.
As we increase f, the phase space trajectories go through several neighboring
limit cycles and eventually become aperiodic and chaotic. Such closed limit
cycles are called Hopf bifurcations of the pendulum on its road to chaos
after the mathematician E. Hopf, who generalized Poincaré's results on such
bifurcations to higher dimensions of phase space.

Another classic attractor is quasiperiodic motion, such as the trajectory

$$x(t) = A_1 \sin(\omega_1 t + b_1) + A_2 \sin(\omega_2 t + b_2), \tag{19.33}$$

where the ratio $\omega_1/\omega_2$ is an irrational number. In the phase plane $(x, y = \dot{x})$ the
orbit forms a closed figure. Such combined oscillations in Eq. (19.33) occur
as solutions of a damped anharmonic oscillator (Van der Pol nonautonomous
system)

$$\ddot{x} + 2\gamma\dot{x} + \omega_2^2 x + \beta x^3 = f\cos(\omega_1 t). \tag{19.34}$$

The importance of the ratio of frequencies being rational is illustrated with
the closed curve $(x = \sin t,\ y = \cos(3t))$ in Fig. 19.11, whereas for the irrational
ratio in Fig. 19.12 the curve $(x = \sin t,\ y = \cos(\sqrt{2}\,t))$ stays open, no matter
how large the time t grows.

Figure 19.11
The Curve $(x = \sin t,\ y = \cos(3t))$ Is Closed

Figure 19.12
The Curve $(x = \sin t,\ y = \cos(\sqrt{2}\,t))$ Is Not Closed


In three dimensions, when there is a positive characteristic exponent and 
an attracting complex conjugate pair in the other directions, we have a spiral 
saddle point as a new feature. Conversely, a negative characteristic exponent 
in conjunction with a repelling pair also gives rise to a spiral saddle point, where 
trajectories spiral out in two dimensions but are attracted in a third direction. 
This is illustrated in Example 19.4.10 and shown in Fig. 19.13. 


EXAMPLE 19.4.10 


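Repelling Spiral Saddle Point The coupled ODEs

$$\dot{x} = y, \qquad \dot{y} = z, \qquad \dot{z} = -x$$

have the general solutions

$$x = \sum_j c_{1j} e^{\lambda_j t}, \qquad y = \sum_j c_{2j} e^{\lambda_j t}, \qquad z = \sum_j c_{3j} e^{\lambda_j t}.$$

Substituting them into the ODEs yields the linear relations

$$\lambda_j c_{1j} = c_{2j}, \qquad \lambda_j c_{2j} = c_{3j}, \qquad \lambda_j c_{3j} = -c_{1j}$$

and the eigenvalue condition

$$\begin{vmatrix} -\lambda & 1 & 0 \\ 0 & -\lambda & 1 \\ -1 & 0 & -\lambda \end{vmatrix} = -\lambda^3 - 1 = 0.$$

The exponents of the solution are given by the eigenvalues

$$\lambda_1 = -1, \qquad \lambda_{2,3} = e^{\pm\pi i/3} = \frac{1}{2} \pm \frac{i\sqrt{3}}{2},$$

which correspond to a spiral repellor for positive time. The linear relations
imply

$$c_{21} = \lambda_1 c_{11} = -c_{11}, \qquad c_{22} = \lambda_2 c_{12}, \qquad c_{23} = \lambda_3 c_{13},$$

$$c_{31} = \lambda_1 c_{21} = -c_{21} = c_{11}, \qquad \lambda_1 c_{31} = -c_{11} = -c_{31},$$

$$\lambda_2 c_{32} = -c_{12}, \qquad \lambda_3 c_{33} = -c_{13}.$$

Figure 19.13
Repelling Spiral Saddle Point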

Therefore, the solutions

$$x(t) = c_{11} e^{-t} + c_{12} e^{(1/2 + i\sqrt{3}/2)t} + c_{13} e^{(1/2 - i\sqrt{3}/2)t},$$

$$y(t) = -c_{11} e^{-t} + c_{12}\left(\frac{1}{2} + \frac{i\sqrt{3}}{2}\right) e^{(1/2 + i\sqrt{3}/2)t} + c_{13}\left(\frac{1}{2} - \frac{i\sqrt{3}}{2}\right) e^{(1/2 - i\sqrt{3}/2)t},$$

$$z(t) = c_{11} e^{-t} - \frac{c_{12}}{\frac{1}{2} + \frac{i\sqrt{3}}{2}}\, e^{(1/2 + i\sqrt{3}/2)t} - \frac{c_{13}}{\frac{1}{2} - \frac{i\sqrt{3}}{2}}\, e^{(1/2 - i\sqrt{3}/2)t}$$

depend on three parameters $c_{11}$, $c_{12}$, $c_{13}$. To simplify them, we use the Euler
identity and keep the real part only. This yields the solution

$$x = c_{11} e^{-t} + (c_{12} + c_{13})\, e^{t/2} \cos\left(\tfrac{1}{2}\sqrt{3}\,t\right),$$

$$y = -c_{11} e^{-t} + \frac{1}{2}(c_{12} + c_{13})\, e^{t/2} \cos\left(\tfrac{1}{2}\sqrt{3}\,t\right) + \frac{\sqrt{3}}{2}(c_{12} - c_{13})\, e^{t/2} \sin\left(\tfrac{1}{2}\sqrt{3}\,t\right),$$

$$z = c_{11} e^{-t} + (c_{12} + c_{13})\, e^{t/2} \cos\left(\tfrac{1}{2}\sqrt{3}\,t\right) - \frac{\sqrt{3}}{2}(c_{12} - c_{13})\, e^{t/2} \sin\left(\tfrac{1}{2}\sqrt{3}\,t\right),$$

which is plotted in Fig. 19.13 for $c_{11} = 1$, $c_{12} + c_{13} = 1$, $c_{12} - c_{13} = 1$. For
$t \to -\infty$, only the $e^{-t}$ terms survive so that $y \sim -x$, $z \sim x$ is the straight line
section in Fig. 19.13 ending near the origin. For $t \to \infty$, only the terms involving
$e^{t/2}$ survive so that $(y - x/2)/(z - x) \sim -1$, which is the plane $z = 3x/2 - y$.
Between these asymptotic limits we recognize the spiral behavior that changes
with the parameters $c_{1j}$ of the curves. The asymptotes do not depend on the
parameters. ■
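The eigenvalue condition $-\lambda^3 - 1 = 0$ of this example can be verified with a few lines of standard-library code. The sketch below builds the three cube roots of $-1$, constructs each coefficient column from the first two linear relations with the normalization $c_{1j} = 1$ (an illustrative choice), and confirms that the third relation $\lambda_j c_{3j} = -c_{1j}$ then holds.

```python
import cmath
import math

# The three cube roots of -1: exp(i*pi*(2k + 1)/3) for k = 0, 1, 2.
roots = [cmath.exp(1j * math.pi * (2 * k + 1) / 3.0) for k in range(3)]

def eigenvector(lam):
    """Column (c1, c2, c3) with c1 = 1 from lam*c1 = c2 and lam*c2 = c3."""
    return (1.0, lam, lam * lam)

residuals = []
for lam in roots:
    c1, c2, c3 = eigenvector(lam)
    # The remaining relation lam*c3 = -c1 holds exactly when lam**3 = -1.
    residuals.append(abs(lam * c3 + c1))

real_parts = sorted(lam.real for lam in roots)
print(roots, residuals)
```

One negative real exponent together with a complex pair of positive real part 1/2 is the signature of the repelling spiral saddle point described in the text.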

Of course, such spiral sinks or saddle points cannot occur in one dimension,
but we might ask if they are stable when they occur in higher dimensions. An
answer is given by the Poincaré-Bendixson theorem, which states that either
trajectories (in the finite region to be specified in a moment) are attracted to a
fixed point as time goes on or they approach a limit cycle provided the relevant
two-dimensional subsystem stays inside a finite region (i.e., does not diverge
there as $t \to \infty$). For a proof, we refer to Hirsch and Smale and Jackson in
Additional Reading.

In general, when some form of damping is present the transients decay and 
the system settles either in equilibrium (i.e., a single point) or in periodic or 
quasiperiodic motion. Chaotic motion is now recognized as a fourth state, and 
its attractors are often called strange. 


Dissipation in Dynamical Systems 


Dissipative forces, in two dimensions for simplicity, entail changes of area in
phase space. An example is the parachutist in Example 8.2.2, where $v \to v_0$
with increasing time, while $\dot{v} \to 0$. The area in phase space spanned by $v$, $\dot{v}$
shrinks to zero due to drag. As particles on such trajectories carry energy,
changing areas involve loss or gain of energy; they often involve velocities,
that is, first-order time derivatives such as friction (e.g., for the damped
oscillator). Let us look for a measure of dissipation; that is, how a small area
$A = c_{1,2}\,\Delta\xi_1 \Delta\xi_2$ at a fixed point shrinks or expands. Here, $c_{1,2} = \sin(\xi_1, \xi_2)$ is a
time-independent angular factor that takes into account the nonorthogonality
of the characteristic directions $\xi_1$ and $\xi_2$ of Eq. (19.29). If we take the time
derivative of A and use $\dot{\xi}_j = \lambda_j \xi_j$ of the characteristic coordinates, implying
$\Delta\dot{\xi}_j = \lambda_j \Delta\xi_j$, we obtain, to lowest order in the $\Delta\xi_j$,

$$\dot{A} = c_{1,2}\left[\Delta\xi_1 \lambda_2 \Delta\xi_2 + \Delta\xi_2 \lambda_1 \Delta\xi_1\right] = c_{1,2}\,\Delta\xi_1 \Delta\xi_2 (\lambda_1 + \lambda_2). \tag{19.35}$$

In the limit $\Delta\xi_j \to 0$, we find from Eq. (19.35) that the rate

$$\frac{\dot{A}}{A} = \lambda_1 + \lambda_2 = \mathrm{tr}(\mathsf{F}) = \nabla \cdot \mathbf{f}\,\big|_{y_0} \tag{19.36}$$

with $\mathbf{f} = (f_1, f_2)$ the vector of time derivatives in Eq. (19.23b) at the fixed point.
Note that the time-independent sine of the angle between $\xi_1$ and $\xi_2$ drops out
of the rate. The generalization to higher dimensions is obvious. Moreover, in
n dimensions,

$$\mathrm{trace}(\mathsf{F}) = \sum_i \lambda_i. \tag{19.37}$$

Although F is not hermitian, this trace formula follows from the invariance of
the secular polynomial in Eq. (19.30) under a linear transformation, $\mathsf{C}\boldsymbol{\xi} = \mathbf{x}$ in
particular, and it is a result of its determinantal form using the product theorem
for determinants (see Section 3.2), viz.,

$$\det(\mathsf{F} - \lambda \cdot \mathsf{1}) = [\det(\mathsf{C})]^{-1} \det(\mathsf{F} - \lambda \cdot \mathsf{1}) [\det(\mathsf{C})] = \det(\mathsf{C}^{-1}(\mathsf{F} - \lambda \cdot \mathsf{1})\mathsf{C})$$

$$= \det(\mathsf{C}^{-1}\mathsf{F}\mathsf{C} - \lambda \cdot \mathsf{1}) = \prod_{i=1}^{n} (\lambda_i - \lambda). \tag{19.38}$$

Here, the product form derives by substituting Eq. (19.31). Now tr(F) is the
coefficient of $(-\lambda)^{n-1}$ upon expanding $\det(\mathsf{F} - \lambda \cdot \mathsf{1})$ in powers of $\lambda$, whereas it
is $\sum_i \lambda_i$ from the product form $\prod_i (\lambda_i - \lambda)$, which proves Eq. (19.37). Clearly,
according to Eqs. (19.36) and (19.37),

SUMMARY

• it is the sign (more precisely the trace) of the characteristic exponents of
the derivative matrix at the fixed point that determines whether there is
expansion or shrinkage of areas and volumes in higher dimensions near a
critical point.

Equation (19.36) clearly states that dissipation requires $\nabla \cdot \mathbf{f}(y) \neq 0$, where
$\dot{y}_j = f_j$, and does not occur in Hamiltonian systems where $\nabla \cdot \mathbf{f} = 0$.
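For a linear system the rate law $\dot{A}/A = \mathrm{tr}(\mathsf{F})$ integrates to $A(t) = A(0)\,e^{\mathrm{tr}(\mathsf{F})\,t}$, which can be tested directly. The following sketch, under the assumption of a constant 2x2 matrix F, integrates two solutions with a classical Runge-Kutta step and compares the area of the parallelogram they span with the exponential of the trace. The integrator, step count, and names are illustrative choices rather than anything prescribed by the text.

```python
import math

def rk4_step(F, v, h):
    """One classical Runge-Kutta step for the linear system v' = F v (2x2)."""
    def rhs(u):
        return (F[0][0] * u[0] + F[0][1] * u[1],
                F[1][0] * u[0] + F[1][1] * u[1])
    k1 = rhs(v)
    k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
    k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
    k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
    return (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

def area_ratio(F, t_end, steps=2000):
    """Signed area spanned by two solutions at t_end, relative to area 1 at t = 0."""
    u, w = (1.0, 0.0), (0.0, 1.0)
    h = t_end / steps
    for _ in range(steps):
        u, w = rk4_step(F, u, h), rk4_step(F, w, h)
    return u[0] * w[1] - u[1] * w[0]

F_sink = [[-1.0, 0.0], [-1.0, -3.0]]    # stable sink example, trace -4
F_spiral = [[-1.0, 3.0], [-3.0, 2.0]]   # spiral repellor example, trace +1

sink_ratio = area_ratio(F_sink, 1.0)
spiral_ratio = area_ratio(F_spiral, 1.0)
print(sink_ratio, math.exp(-4.0))
print(spiral_ratio, math.exp(1.0))
```

Areas shrink for the sink and grow for the spiral repellor, in line with the sign of the trace.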

Moreover, in two or more dimensions, there are the following global 
possibilities: 

• The trajectory may describe a closed orbit (center or cycle). 

• The trajectory may approach a closed orbit (spiraling inward or outward
toward the orbit) as $t \to \infty$. In this case, we have a limit cycle.

The local behavior of a trajectory near a critical point is also more varied in
general than in one dimension: At a stable critical point all trajectories may
approach the critical point along straight lines, curve in as in Example 19.4.6
with trace(F) = −4, spiral inward (toward the spiral sink), or may follow
a more complicated path. If all time-reversed trajectories move toward the
critical point in spirals as $t \to -\infty$, then the critical point is a divergent spiral
point or spiral repellor. This is the case in Example 19.4.8 with trace(F) = 1.
When some trajectories approach the critical point while others move away
from it, as in Example 19.4.7 with trace(F) = 0 (in general, the trace can be
positive or negative at a saddle point), then it is called a saddle point. When
all trajectories form closed orbits about the critical point, it is called a center,
as in Example 19.4.9, where trace(F) = 0.
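The classification just summarized can be sketched in code for the two-dimensional case. The function below is an illustrative helper, not a standard routine: it computes the characteristic exponents of a real 2x2 matrix from its trace and determinant and labels the fixed point with the terms used in this section. The tolerance and the treatment of a vanishing exponent as "degenerate" are assumptions of the sketch.

```python
import cmath

def classify_fixed_point(F, tol=1e-12):
    """Classify the origin of v' = F v for a real 2x2 matrix F."""
    trace = F[0][0] + F[1][1]
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    lam1, lam2 = (trace + root) / 2.0, (trace - root) / 2.0
    if abs(lam1.imag) > tol:                   # complex conjugate pair
        if abs(lam1.real) <= tol:
            kind = "center"
        else:
            kind = "spiral sink" if lam1.real < 0 else "spiral repellor"
    elif lam1.real * lam2.real < 0:
        kind = "saddle point"
    elif lam1.real < 0 and lam2.real < 0:
        kind = "sink"
    elif lam1.real > 0 and lam2.real > 0:
        kind = "repellor"
    else:
        kind = "degenerate"                    # a zero exponent; not treated here
    return kind, trace, (lam1, lam2)

examples = {
    "19.4.6": [[-1.0, 0.0], [-1.0, -3.0]],
    "19.4.7": [[-2.0, -1.0], [-1.0, 2.0]],
    "19.4.8": [[-1.0, 3.0], [-3.0, 2.0]],
    "19.4.9": [[0.0, -1.0], [1.0, 0.0]],       # omega = 1 chosen for illustration
}
results = {name: classify_fixed_point(F) for name, F in examples.items()}
for name, (kind, trace, _) in results.items():
    print(name, kind, trace)
```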


Bifurcations in Dynamical Systems


As in one dimension, such as bifurcations of the logistic map, a bifurcation in
two or more dimensions is a sudden change in dynamics for specific parameter
values, such as the birth of a sink-repellor pair of fixed points or their
disappearance upon adjusting a control parameter (i.e., the motions before and
after the bifurcation are topologically different). At a bifurcation point, not only
are solutions unstable when one or more parameters are changed slightly but also
the character of the bifurcation in phase space or in the parameter manifold
may change. Sudden changes from regular to random behavior of trajectories
are characteristic of bifurcations, as is sensitive dependence on initial
conditions: Nearby initial conditions can lead to very different long-term behavior.
If a bifurcation does not change qualitatively with parameter adjustments, it is
called structurally stable. Note that structurally unstable bifurcations are
unlikely to occur in reality because noise and other neglected degrees of freedom
act as perturbations on the system that effectively eliminate unstable
bifurcations from our view. Bifurcations (such as period doublings in maps) are
important as one among many routes to chaos. Others are sudden changes in
trajectories associated with several critical points called global bifurcations. Often,
they involve changes in basins of attraction and/or other global structures. The
theory of global bifurcations is fairly complicated and still in its infancy.

Bifurcations that are linked to sudden changes in the qualitative behavior
of dynamical systems at a single fixed point are called local bifurcations.
Specifically, a change in stability occurs in parameter space where the real
part of a characteristic exponent of the fixed point alters its sign (i.e., moves
from attracting to repelling trajectories or vice versa). The center manifold
theorem states that at a local bifurcation only those degrees of freedom matter
that are involved with characteristic exponents going to zero: $\Re(\lambda_j) = 0$.
Locating the set of these points is the first step in a bifurcation analysis. Another
step consists in cataloguing the types of bifurcations in dynamical systems, to
which we turn next.

The conventional normal forms of dynamical equations represent a start
in classifying bifurcations. For systems with one parameter (i.e., a
one-dimensional center manifold), we write the general case of NDE as follows:

$$\dot{x} = \sum_{j=0}^{\infty} a_j^{(0)} x^j + c \sum_{j=0}^{\infty} a_j^{(1)} x^j + c^2 \sum_{j=0}^{\infty} a_j^{(2)} x^j + \cdots, \tag{19.39}$$

where the superscript on $a^{(m)}$ denotes the power of the parameter c with which
bifurcations are associated. One-dimensional iterated nonlinear maps such as
the logistic map of Section 19.2 (that occur in Poincaré sections) of nonlinear
dynamical systems can be classified similarly, viz.,

$$x_{n+1} = \sum_{j=0}^{\infty} a_j^{(0)} x_n^j + c \sum_{j=0}^{\infty} a_j^{(1)} x_n^j + c^2 \sum_{j=0}^{\infty} a_j^{(2)} x_n^j + \cdots. \tag{19.40}$$

The logistic map corresponds to $c = \mu$, $a_1^{(1)} = 1$, $a_2^{(1)} = -1$, with all other
$a_j^{(m)} = 0$. One of the simplest NDEs with a bifurcation is

$$\dot{x} = x^2 - c, \tag{19.41}$$

which corresponds to all $a_j^{(m)} = 0$ except for $a_0^{(1)} = -1$ and $a_2^{(0)} = 1$. For
c > 0, there are two fixed points (recall $\dot{x} = 0$) $x_\pm = \pm\sqrt{c}$, with characteristic
exponents $2x_\pm$ so that $x_-$ is a sink and $x_+$ is a repellor. For c < 0, there are no
fixed points. Therefore, as $c \to 0$ the fixed point pair disappears suddenly (i.e.,
the parameter value c = 0 is a repellor-sink bifurcation that is structurally
unstable). This complex map (with $c \to -c$) generates the fractal Julia and
Mandelbrot sets discussed in Section 19.3.
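The repellor-sink bifurcation of Eq. (19.41) is simple enough to tabulate directly. The sketch below, with illustrative function names, lists the fixed points of $\dot{x} = x^2 - c$ together with their characteristic exponents 2x and labels them by the sign of the exponent; the "marginal" label for a zero exponent at c = 0 is an assumption of the sketch.

```python
import math

def fixed_points(c):
    """Fixed points of x' = x**2 - c paired with their exponents 2x."""
    if c < 0:
        return []                       # no real fixed points
    r = math.sqrt(c)
    points = [-r, r] if c > 0 else [0.0]
    return [(x, 2.0 * x) for x in points]

def label(exponent):
    """Name the fixed point by the sign of its characteristic exponent."""
    if exponent < 0:
        return "sink"
    if exponent > 0:
        return "repellor"
    return "marginal"

for c in (4.0, 1.0, 0.0, -1.0):
    print(c, [(x, label(e)) for x, e in fixed_points(c)])
```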

A pitchfork bifurcation (as in Fig. 19.2 at $\mu = 3$) occurs for the undamped
(nondissipative and special case of the Duffing) oscillator with a cubic
anharmonicity

$$\ddot{x} + ax + bx^3 = 0, \qquad b > 0. \tag{19.42}$$

It has a continuous frequency spectrum and is, among others, a model for a
ball bouncing between two walls. When the control parameter a > 0, there is
only one fixed point at x = 0, a sink, whereas for a < 0 there are two more
sinks at $x_\pm = \pm\sqrt{-a/b}$. Thus, we have a pitchfork bifurcation of a sink at the
origin into a saddle point at the origin and two sinks at $x_\pm \neq 0$. In terms of a
potential formulation, $V(x) = ax^2/2 + bx^4/4$ is a single well for a > 0 but a
double well (with a maximum at x = 0) for a < 0.
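The single-well versus double-well picture can be made concrete by listing the equilibria of Eq. (19.42) with the curvature $V''(x) = a + 3bx^2$ of the potential at each one: positive curvature marks a minimum of V, negative curvature the maximum at the origin. This is a minimal sketch with an illustrative function name.

```python
import math

def duffing_equilibria(a, b):
    """Equilibria of x'' + a x + b x**3 = 0 (b > 0) with the curvature V''(x)."""
    points = [0.0]
    if a < 0:
        r = math.sqrt(-a / b)
        points = [-r, 0.0, r]
    # V(x) = a x^2/2 + b x^4/4, so V''(x) = a + 3 b x^2.
    return [(x, a + 3.0 * b * x * x) for x in points]

print(duffing_equilibria(1.0, 1.0))    # single well
print(duffing_equilibria(-1.0, 1.0))   # double well with a maximum at x = 0
```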

When a pair of complex conjugate characteristic exponents $\rho \pm i\kappa$ crosses
from a spiral sink ($\rho < 0$) to a repelling spiral ($\rho > 0$) and periodic motion
(limit cycle) emerges, we call the qualitative change a Hopf bifurcation. They
occur in the quasiperiodic route to chaos that will be discussed in the next
section.

In a global analysis, we piece together the motions near various critical
points, such as sinks and bifurcations, to bundles of trajectories that flow
more or less together in two dimensions. (This geometric view is the current
mode of analyzing solutions of dynamical systems.) However, this flow is no
longer collective in the case of three dimensions, where they diverge from
each other in general, because chaotic motion is possible that typically fills
the plane of a Poincaré section with points.

Chaos in Dynamical Systems


Our previous summaries of intricate and complicated features of dynamical
systems due to nonlinearities in one and two dimensions do not include
chaos, although some of them, such as bifurcations, are sometimes precursors
to chaos. This is why Verhulst's NDE shows no chaos, whereas the
corresponding (logistic) map does. In three- or more dimensional NDEs chaotic
motion may occur (but does not always), often when a constant of the motion
(e.g., an energy integral for NDEs defined by a Hamiltonian) restricts the
trajectories to a finite volume in phase space and when there are no critical points.
Another characteristic signal for chaos is when for each trajectory there are
nearby ones, some of which move away from it and others approach it with
increasing time. The notion of exponential divergence of nearby trajectories is
made quantitative by the Lyapunov exponent $\lambda$ (Section 19.3) of iterated maps
of Poincaré sections (Section 19.4) associated with the dynamical system. If
two nearby trajectories are at a distance $d_0$ at time t = 0 but diverge with a
distance d(t) at a later time t, then $d(t) \sim d_0 e^{\lambda t}$ holds. Thus, by analyzing
the series of points (i.e., iterated maps generated on Poincaré sections), one
can study routes to chaos of three-dimensional dynamic systems. This is the
key method for studying chaos. As one varies the location and orientation of
the Poincaré plane, a fixed point on it is often recognized to originate from
a limit cycle in the three-dimensional phase space whose structural stability
can be checked there. For example, attracting limit cycles show up as sinks
in Poincaré sections, repelling limit cycles as repellors of Poincaré maps, and
saddle cycles as saddle points of associated Poincaré maps.
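For a one-dimensional iterated map the Lyapunov exponent can be estimated as the orbit average of $\ln|f'(x_n)|$. The sketch below applies this to the logistic map, written here in the common form $x_{n+1} = \mu x_n(1 - x_n)$ (an assumption about the notation of Section 19.2). The starting value, transient length, and sample size are illustrative, and the tiny floor inside the logarithm is only a guard against an exact zero derivative.

```python
import math

def lyapunov_logistic(mu, x0=0.3, transient=1000, n=20000):
    """Orbit average of ln|f'(x_n)| for the map x -> mu * x * (1 - x)."""
    x = x0
    for _ in range(transient):          # discard the transient
        x = mu * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        slope = abs(mu * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))
        x = mu * x * (1.0 - x)
    return total / n

periodic = lyapunov_logistic(3.2)   # stable 2-cycle: negative exponent
chaotic = lyapunov_logistic(3.9)    # chaotic regime: positive exponent
print(periodic, chaotic)
```

A negative value signals convergence of nearby orbits onto a cycle, a positive value the exponential divergence $d(t) \sim d_0 e^{\lambda t}$ discussed above. For the 2-cycle at $\mu = 3.2$ the estimate should approach $\frac{1}{2}\ln|4 + 2\mu - \mu^2| = \frac{1}{2}\ln 0.16$.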

Three or more dimensions of phase space are required for chaos to occur 
because of the interplay of the necessary conditions just discussed, viz., 


• bounded trajectories (often the case for Hamiltonian systems); 

• exponential divergence of nearby trajectories (guaranteed by positive 
Lyapunov exponents of corresponding Poincare maps); and 

• no intersection of trajectories. 


The last condition is obeyed by deterministic systems in particular, as
discussed in Section 19.1. A surprising feature of chaos mentioned in Section 19.1
is how prevalent it is and how universal the routes to chaos often are despite
the variety of NDEs.

An example of spatially complex patterns in classical mechanics is the
planar pendulum whose one-dimensional equation of motion

$$I\,\frac{d\theta}{dt} = L, \qquad \frac{dL}{dt} = -lmg \sin\theta \tag{19.43}$$

is nonlinear in the dynamic variable $\theta(t)$. Here, I is the moment of inertia, L is
the orbital angular momentum, l is the distance to the center of mass, m is the
mass, and g is the gravitational acceleration constant. When all parameters in
Eq. (19.43) are constant in time and space, the solutions are given in terms of
elliptic integrals (see Section 5.8) and no chaos exists. However, a pendulum
under a periodic external force can exhibit chaotic dynamics, for example, for
the Lagrangian

$$\mathcal{L} = \frac{m}{2}\,\dot{\mathbf{r}}^2 - mg(l - z), \qquad (x - x_0)^2 + y^2 + z^2 = l^2, \qquad x_0 = \varepsilon l \cos\omega t, \tag{19.44}$$

where $x_0$ represents the periodic driving force (see Moon in Additional
Reading).

Good candidates for chaos are multiple well potential problems

$$\frac{d^2\mathbf{r}}{dt^2} + \nabla V(\mathbf{r}) = \mathbf{F}\left(\mathbf{r}, \frac{d\mathbf{r}}{dt}, t\right), \tag{19.45}$$

where F represents dissipative and/or driving forces. Another classic example
is rigid body rotation whose nonlinear three-dimensional Euler equations are
familiar, viz.,

$$\frac{d}{dt} I_1\omega_1 = (I_2 - I_3)\,\omega_2\omega_3 + M_1,$$

$$\frac{d}{dt} I_2\omega_2 = (I_3 - I_1)\,\omega_3\omega_1 + M_2, \tag{19.46}$$

$$\frac{d}{dt} I_3\omega_3 = (I_1 - I_2)\,\omega_1\omega_2 + M_3,$$

where $(M_1, M_2, M_3)$ is the torque, $I_j$ are the principal moments of inertia, and
$\boldsymbol{\omega}$ is the angular velocity with components $\omega_j$ about the body-fixed principal
axes. Even free rigid body rotation can be chaotic because its nonlinear
couplings and three-dimensional form satisfy all requirements for chaos to occur
(Section 19.1). A rigid body example of chaos in our solar system is the chaotic
tumbling of Hyperion, one of Saturn's moons that is highly nonspherical. It is
a world in which the Saturn rise and set is so irregular as to be unpredictable.
Another is Halley's comet, whose orbit is perturbed by Jupiter and Saturn. In
general, when three or more celestial bodies interact gravitationally, chaotic
dynamics is possible. Note, though, that computer simulations over large time
intervals are required to ascertain chaotic dynamics in the solar system. For
more details on chaos in such conservative Hamiltonian systems, see Chapter
8 of Hilborn in Additional Reading.


Let us now examine some routes to chaos. The period-doubling route to chaos
is exemplified by the logistic map in Section 19.2, and the universal Feigenbaum
numbers $\alpha$, $\delta$ are its quantitative features along with Lyapunov exponents. It
is common in dynamical systems. It may begin with limit cycle (periodic)
motion, which shows up as a set of fixed points in a Poincaré section. The
limit cycle may have originated in a bifurcation from a sink or some other
fixed point. As a control parameter changes, the fixed point of the Poincaré
map splits into two points (i.e., the limit cycle has a characteristic exponent
going through zero from attracting to repelling). The periodic motion now has
a period twice as long as before, etc. We refer to Chapter 11 of Barger and
Olsson in Additional Reading for period-doubling plots of Poincaré sections
for the Duffing equation [Eq. (19.42)] with a periodic external force. Another
example of period doubling is a forced oscillator with friction (see Helleman
in Cvitanovic in Additional Reading).

The quasiperiodic route to chaos is also quite common in dynamical
systems (e.g., starting from a time-independent sink, a fixed point). If we adjust
a control parameter the system undergoes a Hopf bifurcation to the periodic
motion corresponding to a limit cycle in phase space. With further change of
the control parameter, a second frequency appears. If the frequency ratio is
an irrational number, the trajectories are quasiperiodic, eventually covering
the surface of a torus in phase space (i.e., quasiperiodic orbits never close
or repeat). Further changes of the control parameter may lead to a third
frequency or directly to chaotic motion. Bands of chaotic motion can alternate
with quasiperiodic motion in parameter space. An example of such a dynamic
system is a periodically driven pendulum.

A third route to chaos is via intermittency, where the dynamical system
switches between two qualitatively different motions at fixed control
parameters. For example, at the beginning periodic motion alternates with an
occasional burst of chaotic motion. With a change of the control parameter, the
chaotic bursts typically lengthen until, eventually, no periodic motion remains.
The chaotic parts are irregular and do not resemble each other, but one needs
to check for a positive Lyapunov exponent to demonstrate chaos.
Intermittencies of various types are common features of turbulent states in fluid dynamics.
The Lorenz coupled NDEs also show intermittency.


EXERCISES 

19.4.1 For the damped harmonic oscillator

$$\ddot{x} + 2a\dot{x} + x = 0,$$

consider the Poincaré section $\{x > 0,\ y = \dot{x} = 0\}$. Take $0 < a \ll 1$ and
show that the map is given by $x_{n+1} = b x_n$ with b < 1. Find an estimate
for b.

19.4.2 Show that the (Rössler) coupled ODEs

$$\dot{x}_1 = -x_2 - x_3, \qquad \dot{x}_2 = x_1 + a_1 x_2, \qquad \dot{x}_3 = a_2 + (x_1 - a_3)x_3$$

(a) have two fixed points for $a_2 = 2$, $a_3 = 4$, and $0 < a_1 < 2$,

(b) have a spiral repellor at the origin, and

(c) have a spiral chaotic attractor for $a_1 = 0.398$.

19.4.3 Construct a Poincaré map for the Duffing oscillator in Eq. (19.42).

19.4.4 Guess a particular solution of the Riccati NDE $y' = A - ABx^2 y + Bxy^2$,
where A, B are constants. Then reduce the NDE to a Bernoulli NDE
and solve that also.

ANS. y(x) = Ax+\, v(x) = C{x)e- AH ^- x3 '^, 
C(pc ) = B f x xe AB( - x2 - x3/ ^dx. 

19.4.5 Plot the intermittency region of the logistic map at $\mu = 3.8319$. What
is the period of the cycles? What happens at $\mu = 1 + 2\sqrt{2}$?

ANS. There is a (so-called) tangent bifurcation to period 3 cycles.

Appendix 1


Real Zeros of a Function

The demand for the values of the real zeros of a function occurs frequently 
in mathematical physics. Examples include the boundary conditions on the 
solution of a coaxial waveguide problem (Example 11.5.1); eigenvalue prob¬ 
lems in quantum mechanics such as the deuteron with a square well potential 
(Example 9.1.2), that is, solutions of transcental equations from matching con¬ 
ditions in quantum mechanics; and the location of the evaluation points in 
Gaussian quadrature. 

Most numerical methods require close initial guesses of the zero or root. 
How close depends on how wildly your function is varying and what accuracy 
you demand. All are methods for refining a good initial value. To obtain the good 
initial value and to locate pathological features that must be avoided (such 
as discontinuities or singularities), you should make a reasonably detailed 
graph of the function. There is no real substitute for agraph. Exercise 11.3.11 
emphasizes this point. 

Newton’s method is often presented in differential calculus because it assumes
that the function $f(x)$ has a continuous first derivative and requires
its computation. If no pathologies occur, it converges rapidly. However, it is
a method to avoid because it is treacherous: It may fail to converge or may
converge to the wrong root. When iterated, it may lead to chaos for suitable
initial guesses. We therefore do not discuss it in detail and refer to Chapter 9
of Press et al. in Additional Reading.
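To make the failure mode concrete, here is a small sketch of the Newton iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$ applied to $f(x) = \arctan x$, whose only real zero is $x = 0$. The test function and the two starting values are assumptions chosen for illustration; they do not come from the text.

```python
import math

def newton_iterates(f, dfdx, x0, n_iter):
    """Return [x0, x1, ..., x_n] from the plain Newton iteration."""
    xs = [x0]
    x = x0
    for _ in range(n_iter):
        x = x - f(x) / dfdx(x)
        xs.append(x)
    return xs

def f(x):
    return math.atan(x)

def dfdx(x):
    return 1.0 / (1.0 + x * x)

near = newton_iterates(f, dfdx, 0.5, 5)  # starts close to the zero
far = newton_iterates(f, dfdx, 1.5, 5)   # starts only slightly farther away

print("start 0.5:", near)
print("start 1.5:", far)
```

With the closer start the iterates collapse onto the zero, whereas with the farther start each step overshoots by more than the last and the iterates move away from the root.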


Bisection Method 


This method assumes only that $f(x)$ is continuous. It requires that initial
values $x_l$ and $x_r$ straddle the zero being sought. Thus, $f(x_l)$ and $f(x_r)$ will
have opposite signs, making the product $f(x_l) \cdot f(x_r)$ negative. In the simplest
form of the bisection method, take the midpoint $x_m = \frac{1}{2}(x_l + x_r)$ and test to
see which interval, $[x_l, x_m]$ or $[x_m, x_r]$, contains the zero. The easiest test is to
examine one product, for example, $f(x_m) \cdot f(x_r)$. If this product is negative,
then the root is in the upper half interval $[x_m, x_r]$; if it is positive, then the
root must be in the lower half interval $[x_l, x_m]$. (If it vanishes, $x_m$ is itself
the zero.) Remember, we are assuming $f(x)$ to be continuous. The interval
containing the zero is relabeled $[x_l, x_r]$ and the bisecting continues (as in
Fig. A.1) until the root is located to the desired degree of accuracy. Of course,
the better the initial choice of $x_l$ and $x_r$, the fewer bisections required.
However, as explained subsequently, it is important to specify the maximum
number of bisections that will be permitted.

Figure A.1 Bisection Root-Finding Method

Figure A.2 A Simple Pole, $f(x_l) \cdot f(x_r) < 0$, But No Root
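The bisection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the subroutine of any particular library: the function name `bisect`, the default tolerance, the default iteration cap, and the test function $x^2 - 2$ are all hypothetical choices.

```python
def bisect(f, x_left, x_right, tol=1e-10, max_iter=100):
    """Return (root_estimate, iterations_used) for a continuous f whose
    values at x_left and x_right have opposite signs."""
    f_left = f(x_left)
    f_right = f(x_right)
    if f_left == 0.0:
        return x_left, 0
    if f_right == 0.0:
        return x_right, 0
    if f_left * f_right > 0.0:
        raise ValueError("initial values do not straddle a sign change")

    for n in range(1, max_iter + 1):
        x_mid = 0.5 * (x_left + x_right)
        f_mid = f(x_mid)
        if f_mid == 0.0:
            return x_mid, n
        if f_mid * f_right < 0.0:
            # Sign change lies in the upper half interval [x_mid, x_right].
            x_left = x_mid
        else:
            # Sign change lies in the lower half interval [x_left, x_mid].
            x_right, f_right = x_mid, f_mid
        if abs(x_right - x_left) < tol:
            return 0.5 * (x_left + x_right), n

    # Iteration cap reached before the tolerance was met.
    return 0.5 * (x_left + x_right), max_iter

# Illustrative use: the positive root of x**2 - 2, straddled by 1 and 2.
root, n_used = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
print(root, n_used)
```

The loop stops on whichever comes first, the tolerance or the iteration cap, so it cannot run forever even if the tolerance is set unrealistically small.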

This bisection technique may not have the elegance of Newton’s method,
but it is reasonably fast and much more reliable; it is almost foolproof if you
avoid discontinuous functions, such as $f(x) = 1/(x - a)$, shown in Fig. A.2. Again,
there is no substitute for knowing the detailed local behavior of your function
in the vicinity of your supposed root.
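The pole case can be checked numerically. The sketch below, with the hypothetical choices $a = 1$ and starting values 0.5 and 1.7, bisects on the sign change of $f(x) = 1/(x - a)$ and then inspects $|f|$ at the point found.

```python
def bisect_sign_change(f, x_left, x_right, n_iter=40):
    """Plain bisection on a sign change; returns the final midpoint."""
    f_right = f(x_right)
    for _ in range(n_iter):
        x_mid = 0.5 * (x_left + x_right)
        f_mid = f(x_mid)
        if f_mid * f_right < 0.0:
            x_left = x_mid                    # keep the upper half interval
        else:
            x_right, f_right = x_mid, f_mid   # keep the lower half interval
    return 0.5 * (x_left + x_right)

a = 1.0  # location of the simple pole (illustrative value)

def pole(x):
    return 1.0 / (x - a)

x_found = bisect_sign_change(pole, 0.5, 1.7)
residual = abs(pole(x_found))
print("bisection settled at", x_found)
print("|f| at that point:", residual)
```

The bisection homes in on $x = a$, but $|f|$ there is enormous rather than small. Evaluating the function at the supposed root is a cheap way to tell a pole from a zero.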

In general, the bisection method is recommended. 


Three Warnings 


1. Always plot your function before performing any root finding. Compare the 
results from your root finding to your graph. 

2. Since the computer carries only a finite number of significant figures, we
cannot expect to calculate a zero with infinite precision. It is necessary to
specify some tolerance. When the root is located to within this tolerance,
the subroutine returns control to the main calling program.

3. All the approaches mentioned here are iteration techniques. How many
times do you iterate? How do you decide to stop? It is possible to program
the iteration so that it continues until the desired accuracy is obtained. The
danger is that some factor may prevent reasonable convergence. Then your
tolerance is never achieved and you have an infinite loop. It is far safer to
specify in advance a maximum number of iterations. Thus, these subroutines
will stop when either a zero is determined to within your specified
tolerance or the number of iterations reaches your specified maximum,
whichever occurs first. With a simple bisection technique the selection of
a number of iterations depends on the initial spread $x_r - x_l$ and on the
precision you demand. Each iteration will cut the range by a factor of 2.
Since $2^{10} = 1024 \approx 10^3$, 10 iterations should add three significant figures;
20 should add six significant figures to the location of the root.
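The iteration count in the third warning follows from halving the spread $n$ times: the interval width becomes $(x_r - x_l)/2^n$, so $n \ge \log_2[(x_r - x_l)/\text{tol}]$ bisections suffice. A short sketch of that arithmetic, with `bisections_needed` as a hypothetical helper name:

```python
import math

def bisections_needed(x_left, x_right, tol):
    """Smallest n for which |x_right - x_left| / 2**n <= tol."""
    spread = abs(x_right - x_left)
    if spread <= tol:
        return 0
    return math.ceil(math.log2(spread / tol))

# Unit initial spread, three and six decimal places of accuracy.
n_three = bisections_needed(0.0, 1.0, 1e-3)
n_six = bisections_needed(0.0, 1.0, 1e-6)
print(n_three, n_six)
```

A count obtained this way is a natural choice for the maximum number of iterations, perhaps with a few extra allowed as a margin.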


EXERCISES 

A1.1 Write a simple bisection root determination subroutine that will determine
a simple real root once you have straddled it. Test your subroutine
by determining the roots of one or more polynomials or elementary
transcendental functions.

A1.2 Try the bisection method to locate a root of the following functions: 

(a) $f(x) = x^2 - 1$, and $x_0 = 0.9, 1.1$

(b) $f(x) = (x^2 - 1)^{1/2}$, $x_0 = 0.9, 1.1$

(c) $f(x) = \sin x$, $x_0 = 3.0, 3.2$

(d) $f(x) = \tanh x - x/2$, $x_0 = 1.8, 2.0$.

A1.3 The theory of free radial oscillations of a homogeneous earth leads to an
equation

$$\tan x = \frac{x}{1 - a^2 x^2}.$$

The parameter $a$ depends on the velocities of the primary and secondary
waves. For $a = 1.0$, find the first three positive roots of this equation.


ANS. $x_1 = 2.7437$, $x_2 = 6.1168$, $x_3 = 9.3166$.
A1.4 (a) Using the Bessel function $J_0(x)$ generated by function BESSJ0(x) of
Press et al. in Additional Reading, locate consecutive roots of $J_0(x)$:
$\alpha_n$ and $\alpha_{n+1}$ for $n = 5, 10, 15, \ldots, 30$. Tabulate $\alpha_n$, $\alpha_{n+1}$, $(\alpha_{n+1} - \alpha_n)$,
and $(\alpha_{n+1} - \alpha_n)/\pi$. Note how this last ratio is approaching unity.

(b) Compare your values of a n with values calculated from McMahon’s 
expansion, AMS-55, Eq. (9.5.12). 


Additional Reading 

Hamming, R. W. (1971). Introduction to Applied Numerical Analysis. McGraw-Hill,
New York. Reprinted, Hemisphere, New York (1989), especially Chapter 2.
In terms of the author’s insight into numerical computation
and his ability to communicate to the average reader, this book is
unexcelled.

Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1996).
Numerical Recipes, 2nd ed. Cambridge Univ. Press, Cambridge, UK.
Bernoulli, Jacques, 302

Bernoulli equation(s), 423, 879 

Bernoulli functions, 305 

Bernoulli NDE, 879-880 

Bernoulli numbers, 302-305, 541 

Bernoulli polynomials, 310 

Bessel, Friedrich Wilhelm, 590 

Bessel asymptotic forms, 405-406, 617-623

Bessel equation, 441, 446-448, 458-460, 

476, 624 

Bessel functions, 486, 589-637 
asymptotic forms, 405-406, 617-623
Bessel series, 600 
Bessel’s ODE, 476, 595 
differential equations, 594-595 
functions of the first kind, 589-611 
functions of the second kind, 611-617 
generating function for integral order, 
590-593 

Hankel functions, 619-621 
integral representations, 595-596 
Laplace equation in cylindrical coordinates, 
474-476 

Neumann functions, 611-617 
nonintegral order, of, 608 
normalization, 600 
orthogonality, 599-600 
recurrence relations, 593 
series solutions, 441-464 
spherical, 624-637. See also Spherical 
Bessel functions 

Wronskian relation or formula, 614, 616, 
623, 635 
zeros, 598-600 
Bessel inequality, 512-513 
Bessel integral, 596-597, 609-611, 613, 
617-619, 635-636 
Bessel’s ODE, 476, 595 
Bessel singularity, 441 
Bifurcation 

dynamic systems, 898-899 
global, 898 
Hopf, 893, 899 
local, 898 

logistic map, 871-872 
pitchfork, 871, 899 
tangent, 903 
Bifurcation plot, 871 
Binomial coefficient, 285 
Binomial distribution, 802-804, 810 
Binomial theorem, 284-286 
Biographical data 
Abel, Niels Hendrik, 280 
Bessel, Friedrich Wilhelm, 590 


Cauchy, Augustin Louis, 343 
Cramer, Gabriel, 162 
d’Alembert, Jean Le Rond, 834 
Descartes, Rene, 6 
Dirac, Paul, 92 
Euler, Leonhard, 201-202 
Fourier, Jean Baptiste, 669 
Frobenius, Georg, 448 
Gauss, Carl Friedrich, 69-70 
Green, George, 70 
Hamilton, William R., 842 
Helmholtz, Hermann von, 763 
Hilbert, David, 642 
Jacobi, Carl, 120 
Kronecker, Leopold, 141 
Laguerre, Edmond Nicolas, 655 
Laplace, Pierre Simon, 54 
Laurent, Pierre-Alphonse, 357 
Legendre, Adrien Marie, 558 
Leibniz, Gottfried Wilhelm von, 172 
Lie, Sophus, 233 
Liouville, Joseph, 501 
Lorentz, Hendrik Antoon, 254 
Maclaurin, Colin, 286 
Neumann, Karl, 613 
Pascal, Blaise, 825 
Pauli, Wolfgang, 209 
Poincare, Jules Henri, 869 
Poisson, Simeon Denis, 85 
Rayleigh, Lord, 861 
Riemann, Bernhard, 335 
Stokes, Sir George Gabriel, 75 
Sturm, Jacques Charles, 501 
Taylor, Brooke, 282-283 
Weierstrass, Karl T. W., 528 
Wigner, Eugen Paul, 233 
Wronski, Jozef Maria, 436 
Biot-Savart law, 35, 111 
Bisection method, 905-909 
Bisection root-finding method, 905-906 
Bivectors, 208 

Bloch-Grüneisen approximation, 311
Block-diagonal, 235 
Bohr radius, 549, 658, 719 
Born approximation, 773, 779-780
Bose-Einstein (BE) statistics, 788, 859 
Boundary conditions, 471n 
Cartesian, 473-474 
Cauchy, 758, 759 
Dirichlet, 758, 759 

electrostatic potential of ring of charge, 574 
forms, 758 

initial conditions, 759 
Laguerre polynomials, 655 
maximum surface tension, 852 
Neumann, 758, 759
PDEs, 758-760 

Rayleigh-Ritz variational technique, 861 
self-adjoint ODEs, 490 
Sturm-Liouville theory, 490 
waveguide, coaxial cable, 614 
Boundary value problems, 758 
Box-counting dimension, 875-876 
Branch point, 363-367, 374, 377 
Bromwich integral, 729, 746-752 
Bulirsch-Stoer method, 467
Butterfly effect, 868 

C

C n , 10n 

Calculus of residues, 378-400 
Cauchy principal value, 384-390
evaluation of definite integrals, 379-384 
inversion, 748 
Jordan’s lemma, 383 

pole expansion of meromorphic functions, 
390-391 

product expansion of entire functions,
392-394 

residue theorem, 378-379 
Rouche’s theorem, 393 
Calculus of variations, 826-866 
concept of variation, 827 
constraints, 850-852 
dependent and independent variable, 
827-837 

Euler equation, 829 
ground-state eigenfunction, 862-863 
Hamilton’s principle, 841 
Lagrangian equations, 841-843, 853 
Lagrangian multipliers, 848-850 
Rayleigh-Ritz variational technique, 861-865 
several dependent/independent variables, 
847-848 

several dependent variables, 837-845 
several independent variables, 845-847 
soap film, 832-834 
Sturm-Liouville equation, 861-863 
transversality condition, 839-841 
uses, 826 

variation with constraints, 850-852 
Capsule biographies. See Biographical data 
Cartesian coordinate systems, 96-98 
Catalan’s constant, 269, 296, 313, 537 
Catenoid, 834 

Cauchy, Augustin Louis, Baron, 343 
Cauchy boundary conditions, 758, 759 
Cauchy-Bromwich integral, 751 
Cauchy convergence test, 263-264 
Cauchy criterion, 260 


Cauchy inequality, 347 
Cauchy integral for powers, 339 
Cauchy integral formula, 344-348, 750 
Cauchy integral theorem, 337-343 
Cauchy integral test, 264-266 
Cauchy principal value, 384-390
Cauchy product of two series, 275 
Cauchy ratio test, 263-264 
Cauchy-Riemann conditions, 331-337 
Cauchy root test, 263 
Cayley-Klein parameters, 139n, 241, 242 
Celestial Mechanics (Laplace), 54 
Center, 892, 898 
Center manifold theorem, 898 
Center of mass of three points at corners of 
triangle, 7-8 
Central moments, 794 
Centrifugal potential, 79-80 
Chain rule 

energy conservation of conservative force, 
117 

integrals in cylindrical coordinates, 101 
orthogonal coordinates, 116 
partial derivatives, 42n 
Chain rule of differentiation, 18 
Chaos, 867-878 
bifurcation, 871-872 
butterfly effect, 868 
deterministic, 868 
dynamic systems, in, 900-902 
fourth state, as, 896 
fractals, 875-878 
intermittency, 902 
logistic map, 869-874 
Lyapunov exponent, 874-875
order, 873 

period doubling, 901-902 
Poincare, 867 
quasiperiodic route to, 902 
sensitivity to initial conditions/parameters, 
874-878 

sensitivity to small perturbations, 872
Chaotic tumbling of Hyperion, 901 
Characteristic exponent, 886 
Characteristic function, 794 
Characteristic value, 214n 
Characteristic vector, 214n 
Charge form factor, 701-703 
Chebyshev differential equation, 478 
Chebyshev inequality, 793-794 
Chebyshev polynomial technique, 543 
Chi square fits, 817 
Chi square distribution, 817-821 
Circular aperture, 597 
Circular cylinder coordinates, 98-113, 759 
Circular membrane, 600-604
Clairaut’s ODE, 419-421
Classical harmonic oscillator, 731-732 
Closure, 230 

Coaxial waveguides, 614-615 
Cofactor, 165 
Coin tossing, 783 
Combinations, 787-788 
Commutative, 2 
Commutator, 223 

Commutator bracket symbol, 180n 

Comparison test, 262-263 

Complete elliptic integral of the first kind, 

298 

Complete elliptic integral of the second kind, 
298 

Complete solution, 460 
Completeness of eigenfunctions, 510-522 
Bessel’s inequality, 512-513 
expansion (Fourier) coefficients, 518-519 
Schwarz inequality, 513-514 
Sturm-Liouville eigenfunctions, 510-511 
vector space, 515-518 
Complex algebra, 319-331 
Complex conjugation, 321-323 
Complex form of Ohm’s law, 327 
Complex Fourier series, 677-682 
Complex number. See also Complex variable, 
Functions of a complex variable 
addition, 321 

conversion to polar form, 323-324 
division, 322 
multiplication, 321 
square root, 326 
subtraction, 321 

Complex plane-Argand diagram, 320 
Complex variable. See also Complex number, 
Functions of a complex variable 
areas of application, 318-319 
defined, 320 

multiplication by, 324-325 
real/imaginary part, 320 
Condition number, 220 
Conditional convergence, 271 
Conditional probability, 785 
Conditional probability distribution, 796 
Condon-Shortley phase, 585 
Confidence interval, 821, 823-824 
Confluent hypergeometric functions, 545 
Conformal mapping, 368-370 
Conformal transformations, 369-370 
Conjugate group elements, 232 
Conjugate subgroup, 236 
Connected domain, simply or multiply, 340n,
341-342 


Conservation laws, 229 
Conservative, 41 
Constant of separation, 472 
Constrained Lagrangian equations of motion, 
853 

Constraints, 848-857 
Continuity equation, 46 
Continuous random variable, 790 
Contour integrals, 337-339 
Contraction, 149 
of tensors, 149 

of vectors, scalar product, 12-20, 149 
Contravariant four-vectors, 144 
Contravariant tensor, 144 
Contravariant vector, 143 
Convergence 
absolute, 271, 274 
Cauchy criterion, 260 
conditional, 271 
defined, 258 

Fourier series, 272-273, 666, 682-683 
improvement, 307-310 
nonuniform, 277-278 
power series, 291 
Riemann zeta function, 308-309 
square wave, 669 
tests, 262-269 
uniform, 276-277 
Convergence tests, 262-269 
Cauchy root test, 263 
comparison test, 262-263 
d’Alembert ratio test, 263-264 
Maclaurin integral test, 264-266 
Convolution 

Fourier transform, 712-717 
Laplace transform, 742-746 
Convolution integral, 713-714 
theorem, 712-715, 742-745 
Coordinate rotation, 231 
Coordinate systems, 96-97. See also Curved 
coordinates 
Correlation, 795 
Cosine integral, 547 
transform, 699-700 
infinite product representation, 392 
Coulomb electrostatic force law, 84 
Coulomb excitation of nuclei, 587 
Coulomb inverse square expression, 85 
Coulomb law, 82n, 574, 769 
Coulomb potential, 45 
Coulomb potential by convolution, 714-715 
Course in Modern Analysis, A, 300
Covariance, 138, 248, 795 
Covariance of cross product, 142 
Covariance of gradient, 143 
Covariant four-vectors, 144

Covariant tensor, 144 

Covariant vector, 143 

Cramer, Gabriel, 162 

Cramer’s rule, 162, 168, 172 

Creation operator, 641 

Criterion, of convergence (Leibniz), 

270 

Critical point, 882 
Cross product, 20-24 
Crystal lattice, 490 
Curl 

Cartesian coordinate, 47-61, 98 
central force, of, 48-61 
curvilinear coordinates, 124-126 
cylindrical coordinates, 110-111 
integral definition, 66-67 
irrotational, 50 

spherical polar coordinates, 129 
vector analysis, 47-51 
Curved (curvilinear) coordinates, 96-136 
circular cylinder coordinates, 98-113, 759 
curl, 124-125 
divergence, 122-123 
gradient, 121-122 
orthogonal coordinates, 113-121 
rectangular Cartesian coordinates, 97-98 
separation of variables, 470-480 
spherical polar coordinates, 126-136, 760 
Cut line, 326, 365-366 
Cycle, 869, 891 

Cylindrical coordinates, 98-113 
Cylindrical resonant cavity, 604-608 
Cylindrical traveling waves, 621 

D 

d’Alembert, Jean Le Rond, 834 

d’Alembert ratio test, 263-264 

Damped oscillator, 734-735 

Damped pendulum, 892 

de la Vallee-Poussin, Ch., 273 

De Moivre’s formula, 325-326 

Debye functions, 313 

Debye temperature characteristic, 311 

Decay law, 412 

Definite integral, 379-384 

Degeneracy, 501 

Degenerate eigenvalues, 217-218 
Delta function. See Dirac delta function 
Derivatives, 35-58, 346 
partial, 35-40 
total variation, 37 

Derivatives of elementary functions, 

334 

Descartes, Rene, 6 


Determinants, 159-174 
antisymmetry, 166 

battery-powered circuit, 159-160, 162-164 
characteristic value or vector, 214n 
Gauss elimination, 168-172 
Hilbert, 164-165 

homogeneous linear equations, 160-161 
inhomogeneous linear equations, 161-162 
Laplacian development by minors, 164 
linear equations, 159-162, 167-168 
matrix product, of, 181-182 
order, 163 

product theorem, 180-181 
secular equation, 214 
Deterministic chaos, 868 
Deuteron, 147, 488-489, 722
Diagonal matrices, 182-183 
Diagonalization of matrices, 211-228 
anti-Hermitian matrices, 216 
eigenvectors/eigenvalues, 212-214 
exponential of diagonal matrix, 222-223 
function of matrices, 221-223 
Hermitian matrices, 214-216 
ill-conditioned systems, 220-221
moment of inertia matrix, 211-212 
normal modes of vibration, 218-220 
Difference equation, 524 
Differential equations, 410-481 
Bessel equation, 446-448, 458-460
Bessel ODE, 594-595 
Bessel singularity, 441 
Clairaut’s ODE, 419-421 
Euler’s ODE, 427-428 
exact, 413-414 
first-order ODEs, 411-424 
first-order ODEs with y/x dependence, 419 
Frobenius solution, 441-454
Fuchs’s theorem, 450 
higher-order, 758 

homogeneous/inhomogeneous ODE, 410, 
414, 415 

inhomogeneous Euler ODE, 430-431
inhomogeneous ODE with constant 
coefficients, 431-432 
Laguerre ODE, 650-652 
Legendre polynomials, 564-565 
linear first-order ODEs, 414-418 
linear ODE, 410 

nonlinear, 878-903. See also Nonlinear 
differential equations (NDEs) 
numerical solutions, 464-470 
ODE, 410 

ODEs with constant coefficients, 428-429
PDEs, 756-781. See also Partial differential 
equations (PDEs) 
power series expansion method, 441-454
predictor-corrector methods, 467-468
regular/irregular singularities, 448-450
Runge-Kutta method, 466-467 
second-order ODEs, 424-439 
second solution, 454-464
third solution, absence of, 435-436 
self-adjoint ODEs, 483-496
separable ODE, 411-416
separation of variables, 470-480. See also
Separation of variables
series solutions (Frobenius’s method),
441-454

singular points, 439-441 
superposition principle, 411 
symmetry, 445-446
Taylor series solution, 464-465 
Differentiating a vector function, 17 
Differentiation, 685 
Diffraction, 597 
Diffusion PDE, 760-761 
Digamma function, 535-536 
Dirac, Paul Adrien Maurice, 92 
Dirac bra-ket notation, 178, 514 
Dirac delta function, 86 
examples, 87-92 
exercises, 681, 687 
Fourier transform, 696-697 
Laplace transform, 732 
orthogonal azimuthal functions, 587 
Parseval’s relation, 716 
Dirac delta function expansions, 574 
Dirac ket, 178 
Dirac notation, 504 
Dirac relativistic wave functions, 539 
Direct product, 149-151, 182 
Direct product of vectors, 150-151 
tensors, 149-150 
Direct sum of vector spaces, 182 
Direction cosines, 4, 194-195 
Dirichlet boundary conditions, 572, 628, 634,
664, 758, 759 
Dirichlet kernel, 89 
Dirichlet problem, 767 
Dirichlet series, 280 
Discrete random variable, 789-790 
Disquisitiones Arithmeticae (Gauss), 69
Dissipation in dynamic systems, 896-898 
Distance in cylindrical coordinates, 

100-101 

Divergence, 257 
central force field, 44-46
curvilinear coordinates, 122-123 
cylindrical coordinates, 108-109 


integral definition, 66-67 
spherical polar coordinates, 129 
vector analysis, 44-47
Diverging partial sums, 257-258 
Dot product, 12-20 
Double factorial notation, 530 
Double well, 899 

Driven harmonic oscillator, 707-708 
Driven oscillator with damping, 743-745 
Driving term, 414 
Drum, 600-604 
Dual tensors, 153-157 
Duffing equation, 902 
Duffing oscillator, 899 
Dynamic systems 
bifurcation, 898-899 
chaos, 900-902 
dissipation, 896-898 
routes to chaos, 901-902 

E 

Eigenvalue/eigenvector, 212-214 
completeness, 510-522. See also 

Completeness of eigenfunctions 
degenerate eigenvalues, 217-218 
deuteron, 488-489 
ground-state eigenfunction, 862-863 
Hermitian operators, 496-500 
orthogonal eigenfunctions, 498-500 
orthogonality, 485 
real eigenvalues, 496-497 
real symmetric matrix, 216-217 
self-adjoint ODEs, 485-489 
variational principle for, 861-864 
Einstein’s energy relation, 252 
Einstein’s nonlinear field equations, 115 
Einstein’s summation convention, 146 
Einstein’s velocity addition law, 251, 254-255 
Elastic forces, 8-9 
Electric circuit, 327-328 
Electric dipole moment, 559 
Electric dipole potential, 559 
Electrical circuit, 432 
Electromagnetic field tensor, 147 
Electromagnetic wave equations, 56-57, 736 
Electromagnetic waves, 736-737, 749-752 
Electrostatic multipole expansion, 560 
Electrostatic two-hemisphere problem, 274 
Electrostatics, 552-553 
Ellipsoid volume, 131 
Elliptic integral, 296-302 
Elliptic integral of the first kind, 297-298 
Elliptic integral of the second kind, 298 
Elliptic PDE, 470 
Empty set, 784 
Energy conservation for conservative force,
117-118 

Entire function, 335, 392 
Equality (matrix), 175 
Equation 

Bernoulli, 879 

Bessel, 441, 446-448, 458-460, 476 
continuity, 46 
difference, 524 

differential. See Differential equations 

Duffing, 902 

Euler, 729 

heat flow, 471 

Helmholtz, 762, 768 

indicial, 443 

Klein-Gordon, 757 

Laplace, 571, 572, 756, 846 

Legendre, 487, 565 

linear, 159-162, 167-168 

Poisson, 84-85, 756 

Riccati, 880 

secular, 214 

Schrodinger. See Schrodinger (wave) 
equation 

Sturm-Liouville, 861 
wave, 709 

Equation of constraint, 849 
Equations of motion and field equations, 
152-153 

Equipotential surface, 42 
Error estimates, 36 
Error function, 314-315 
Error integral, 548 

asymptotic expansion, 314-316 
Essential singularity, 373, 440 
Euler, Leonhard, 201-202 
Euler angle rotation matrix, 202, 205 
Euler angles, 200-201 
Euler equation, 829, 830 
Euler identity, 222, 748, 889 
Euler infinite limit definition, 523-524 
Euler integral, 406, 524-526, 543, 693 
Euler-Lagrange equations, 851 
Euler-Maclaurin integration formula, 306-307, 
540-541 

Euler-Mascheroni constant, 266, 267 
Euler ODE, 427-428 
Euler rotations, 205 
Euler solution, 465 

Evaluation of definite integrals, 379-384 

Even parity, 446 

Even permutation, 155n 

Even symmetry, 445 

Exact differential equations, 413-414

Expansion (Fourier) coefficients, 518-519 


Expectation value, 245, 493, 791 
Exponential integral, 546-548 
External force, 20 

Extremum under a constraint, 38-39

F 

Fabry-Perot interferometer, 329
Factorial function, 406. See Gamma function 
(factorial function) 
complex argument, 523-527 
contour integral, 531 
Legendre duplication formula, 528, 635 
steepest descent asymptotic formula, 
406-407 

Factorial notation, 528-530 
Famous physicists/mathematicians. See 
Biographical data 
Faraday’s induction law, 56, 74 
Faraday’s law, 74 
Faltungs theorem 

Fourier transform, 712-717 
Laplace transform, 742-746 
FD statistics, 788, 859 
Feigenbaum number, 872, 873, 901 
Fermat’s principle of optics, 836 
Fermi-Dirac (FD) statistics, 788, 859 
Feynman diagrams, 91 
Field, 180 
Finite sum, 257 
Finite wave train, 704-705 
First-order ODEs, 411-424, 470
Fixed points, 870, 882-884 
Fixed singularities, 881-882 
Flavor symmetry, 229 
Form invariance, 138 
Formula 

Baker-Hausdorff, 223 
Cauchy’s integral, 344-348 
De Moivre’s, 325-326 
Euler-Maclaurin, 306-307 
Legendre’s duplication, 528 
Rodrigues’, 579 
Stirling’s, 542, 543 
trace, 222 

Foundations of Geometry (Hilbert), 519 
Four-dimensional Clifford algebra, 208 
Fourier, Jean Baptiste Joseph, Baron, 669 
Fourier analysis, 663 
Fourier-Bessel expansion, 763 
Fourier coefficients, 518-519 
Fourier convolution, 712-717 
Fourier cosine series, 672, 687 
Fourier cosine transform, 609, 695, 699-700, 
703 

Fourier-Mellin integral, 747 
Fourier series, 663-688
Abel’s theorem, 678 
absolute convergence, 667, 683 
advantages, 671-673 
change of interval, 674 
completeness, 664-666 
complex, 677-682 
conditional, 666, 669 
convergence, 272-273, 666, 682-683 
defined, 663 
differentiation, 685 
discontinuities, 666-667 
eigenfunctions, 664 
general properties, 663-671 
integration, 684-685
orthogonality, 499, 664 
periodic functions, 671-673 
saw tooth, 665-666 
square wave, 668 
table of functions, 686 
Fourier sine coefficients, 682 
Fourier sine series, 672 
Fourier sine transform, 609, 700, 703 
Fourier transform, 690-716 
convolution theorem, 712-717 
cosine transform, 699-700, 703 
derivatives, of, 706-707 
exponential transform, 698-699 
finite wave train, 704-705 
Gaussian, of, 692-693 
inverse, 694-698 
inversion theorem, 698 
momentum representation, 718-721 
Parseval’s relation, 715-716 
sine transform, 700, 703 
Fourth-order Runge-Kutta method, 467 
Fractals, 875-878 
Fractional order, 346n 
Fraunhofer diffraction circular aperture, 
597-598 

Fraunhofer diffraction optics, 716 
Fredholm integral equations, 743 
Free particle motion, 14-17 
Fresnel integrals, 288, 398, 633, 634 
Frobenius, Georg, 448 
Frobenius’s method, 441-454
Fuchs’s theorem, 450 
Full-wave rectifier, 667-668 
Function 
analytic, 335 

Bessel. See Bessel functions 
characteristic, 794 
digamma, 535-536 
entire, 335, 392 
factorial, 406 


Hankel, 405-406, 619-621
harmonic, 336 
meromorphic, 372, 390 
multivalued, 326, 374 
Neumann, 611-617 
nonperiodic, 694 

orthogonal. See Orthogonal functions 

orthonormal, 503 

periodic, 671-673 

polygamma, 536 

rational, 390-391 

real zeros, 905-909 

weight, 485-487 

Function with two branch points, 374-377 

Functional, 828n 

Functions of a complex variable, 318-409. See 
also Complex number, Complex variable 
analytic functions, 334 
analytic landscape, 400-402 
analytical continuation, 352-354 
branch points, 374 

calculus of residues, 378-400. See also 
Calculus of residues 
Cauchy integral formula, 344-348 
Cauchy integral theorem, 337-343 
Cauchy principal value, 384-390
Cauchy-Riemann conditions, 331-337 
complex conjugation, 321-323 
conformal mapping, 368-370 
contour integrals, 337-339 
De Moivre’s formula, 325-326 
derivatives, 346 

derivatives of elementary functions, 334 

electric circuit, 327-328 

Laurent expansion, 350-359

Laurent series, 354-358

mapping, 360-370. See also Mapping 

method of steepest descents, 400-409

Morera’s theorem, 346-347 

poles, 373-374 

positive quadratic form, 320 

residue theorem, 378-379 

saddle point method, 402-407

Schwarz reflection principle, 351-352 

singularities, 372-377 

Taylor expansion, 350-351 

Fundamental organizing principle of nature,
868 

Fundamental theorem of algebra, 348 

G 

Galilean transformation, 250 

Gamma function (factorial function), 523-551 
definitions, 523-528 
digamma function, 535-536
double factorial notation, 530
error integral, 548 

Euler infinite limit definition, 523-524 
Euler integral, 524-526 
exponential integral, 546-548 
factorial notation, 528-530 
incomplete gamma function, 545 
integral representation, 531-532 
Maclaurin expansion, 537 
numerical evaluation, 542-544 
polygamma function, 536 
series summation, 537 
Stirling’s series, 541-542 
Weierstrass infinite product definition, 
526-528 

Gauge invariance, 229 

Gauss, Carl Friedrich, 69-70, 792 

Gauss distribution, 807-812 

Gauss elimination, 168-172 

Gauss hypergeometric series, 268 

Gauss-Jordan elimination, 170, 171 

Gauss-Jordan matrix inversion, 186-188 

Gauss law, 82-84 

Gauss notation, 529 

Gauss-Seidel iterative method, 171, 223 

Gauss theorem, 68-69 

Gauss’s law, 82-84 

Gaussian integrals, 533 

Gaussian quadrature, 578 

Gell-Mann, Murray, 229 

General solution of homogeneous ODE, 416 

Generating function, 303n, 395 

Generator of a Lie group, 237-242 

Geometric calculus, 208 

Geometric series, 258-259 

Geometry of metric spaces, 69 

Gibbs phenomenon, 666 

Global behavior, 885, 898 

Global bifurcations, 898 

Goursat’s contribution, 346 

Gradient 

Cartesian coordinates, 40-43, 97 
covariance of, 143 

curvilinear coordinates, 107, 121-122 
cylindrical coordinates, 107-108 
dot product, of, 51 
function of r, of, 41-42 
integral definition, 66-67 
Minkowski space, in, 144 
partial derivative, 35 
spherical polar coordinates, 129 
vector analysis, 35-44 
vector operator, as, 40-41 
Gram-Schmidt orthogonalization, 503-510 
Gravitational Poisson equation, 85 


Gravitational potential, 61-62 
Gravitons, 147 
Great circle, 829 
Green, George, 70 
Green’s function, 769-776 
inhomogeneous PDE, 769-771 
quantum mechanical scattering, 774-776
spherical, 778 
Green’s theorem, 70 
Gregory series, 290 
Ground-state eigenfunction, 862-863
Group, 230

Group theory, 229-256 
Cayley-Klein parameters, 241, 242 
continuous, 230 
coordinate rotation, 231 
generators, 237-243 
group, defined, 230 

homogeneous Lorentz group, 248-254 
homomorphism, 234 
isomorphism, 234 
kaon decay, 253 

ladder operator approach, 244-247
Lie groups, 230 

orbital angular momentum, 243-248 
pion photoproduction threshold, 253 
reducible/irreducible, 235 
rotation groups SO(2) and SO(3), 
238-239 

rotation of functions/orbital angular 
momentum, 239-240 
similarity transformation, 231-233 
simple unitary groups, 233 
special unitary group SU(2), 240-242 
spin 1/2 states, 247 
subgroup, 231 

vector analysis in Minkowski space-time, 
251-253 

H 

Hadamard, J., 273 
Halley’s comet, 901 
Hamilton, William R., 842 
Hamilton principle-Lagrange equation 
formulation, 841-842 
Hamilton’s principle, 841, 853 
Hamiltonian, 641, 720 
Hamming predictor-corrector method, 468 
Hankel function, 405-406, 619-621 
asymptotic forms, 617-621 
Hankel function integral representation, 
609 

Harmonic function, 336 
Harmonic oscillator. See Simple harmonic 
oscillator 





Harmonic series, 259 
alternating, 270 
ratio test, 264 
rearrangement, 271-272 
regrouping, 259-260 
Hausdorff dimension, 875-877 
formula. See Baker-Hausdorff, 223 
Heat flow equation, 471 
Heat flow PDE, 708, 760-761 
Heaviside expansion theorem, 753 
Heaviside step function, 89n 
Heaviside unit step function, 696, 728, 

738, 748 

Heisenberg formulation of quantum 
mechanics, 225 

Heisenberg uncertainty principle, 628 
Helmholtz, Hermann Ludwig Ferdinand von, 
763 

Helmholtz’s equation, 628, 759, 762, 768 
Helmholtz’s PDE, 763, 776 
Helmholtz’s resonator, 633 
Hermite, Charles, 642 
Hermite differential equation, 452, 462 
Hermite equation, 478 
Hermite polynomials, 89, 509, 638-650 
alternate representations, 645 
first few polynomials, 643-644, 646 
generating function, 644 
orthogonal functions, 486, 492, 508 
orthogonality, 646-647 
parity, 644-645 

raising/lowering operators, 639-642 
recurrence relations, 643 
Rodrigues representation, 645
simple harmonic oscillator, 638-639 
Hermitian conjugates, 246 
Hermitian matrix, 207, 209, 214-216 
eigenvalues, eigenvectors, 214-218 
Hermitian number operator, 641 
Hermitian operators, 491-503 
degeneracy, 501 

orthogonal eigenfunctions, 498-500 
properties, 496 
quantum mechanics, 492-493 
real eigenvalues, 496-497 
square wave, 500 

High-frequency components, 666, 669 
Higher order differential equations, 

758 

Hilbert, David, 519 
Hilbert determinant, 164-165 
Hilbert matrix, 221 
Hilbert space, 510, 511, 519 
Histogram, 803 
Holomorphic function, 335n 


Homogeneous Helmholtz equation, 776 
Homogeneous linear equations, 160-161 
Homogeneous Lorentz group, 248-254 
Homogeneous medium, 838 
Homogeneous ODE, 411, 414, 415 
Homogeneous PDE, 470 
Homomorphism, 234 
Hooke’s law constant, 720 
Hopf, E., 893 

Hopf bifurcation, 893, 899 
Hubble’s law, 11 
Hulthen wave function, 722 
Hydrogen atom, 657-658, 719-720, 790 
Hydrogenic momentum wave function, 720, 
723 

Hyperbolic cosine, 860 
Hyperbolic PDE, 470 
Hypergeometric functions, 268, 299 
Hyperion, 901 
Hypocycloid, 836 

I 

Ill-conditioned systems, 220-221

Ill-conditioned matrix, 221 

Imaginary part, 320 

Impulse function, 731 

Impulsive force, 733 

Incomplete gamma functions, 545 

Independence of reference frame, 141 

Independent events, 785 

Independent random variables, 795 

Indicial equation, 443 

Inertia matrix, 211-212 

Inertia tensor, 147 

Inertial frames, 248n 

Inequality 

Bessel, 512-513 
Cauchy, 347 
Chebyshev, 793-794 
Schwarz, 513-514 

Infinite-dimensional Hilbert space, 519 
Infinite products. See Product expansion of 
entire functions, 392-394. 

Infinite series, 257-317. See also Series 
Abel’s test, 279-280 
addition/subtraction of series, 260-261 
algebra of series, 274-276 
alternating series, 269-274 
asymptotic series, 314-316 
Bernoulli functions, 305 
Bernoulli numbers, 302-305 
binomial theorem, 284-286 
convergence. See Convergence 
convergence tests, 262-269. See also 
Convergence tests 
divergence, 257 
elliptic integrals, 296-302 
Euler-Maclaurin integration formula, 
306-307 

geometric series, 258-259, 261 
harmonic series, 259-260 
Leibniz criterion, 270 
Maclaurin theorem, 283-284 
multiplication of series, 275-276 
power series, 291-294 
Riemann zeta function, 266, 305, 308, 311, 686 

series of functions, 276-281 
Taylor’s expansion, 281-291 
term-by-term differentiation, 280 
Weierstrass M test, 278-279 
Zeta function, 266, 305, 308, 311, 686 
Inhomogeneous Euler ODE, 430-431 
Inhomogeneous linear equations, 161-162 
Inhomogeneous ODE, 411, 414, 415 
Inhomogeneous ODE with constant 
coefficients, 431-432 
Inhomogeneous PDE, 470 
Inhomogeneous PDE (Green’s function), 
769-776 

Initial conditions, 759 
Inner product, 485 
Integral 

Bromwich, 746-752 
Cartesian coordinates, in, 98 
contour, 337-339 
convolution, 713-714 
cosine, 547 

cylindrical coordinates, in, 101-102 

definite, 379-384 

differentiation of, 461 

elliptic, 296-302 

error, 548 

Euler, 524-526 

exponential, 546-548 

Fresnel, 398 

integration, 491-492 

line, 59-62 

logarithmic, 547 

orthogonality, 584 

overlap, 485 

Parseval’s relation, by, 716 
Riemann, 59 
sine, 547 

spherical polar coordinates, in, 130 

surface, 62-65 
volume, 65-66 
work, 103 

Integral test, 264-266 


Integral transforms, 689-755 
defined, 689 

Fourier transform. See Fourier transform 
Laplace transform. See Laplace transform 
linearity, 689-690 

momentum representation, 718-721 
Integration. See also Integral 
Fourier series, 684-685 
integral definitions of gradient/divergence/curl, 66-67 
Laplace transform, 738-739 
power series, 292 
vector, 58-68 

Integration interval, [a, b], 491-492 
Intermittency, 902 
Internal force, 20 
Intersection, 784 

Introduction to Quantum Mechanics 
(Griffiths), 630 
Invariant, 148 
Invariant subgroup, 231 
Inverse cosine transform, 696 
Inverse Fourier transform, 694-698, 709 
Inverse Laplace transform, 746-754 
Inverse operator, uniqueness of, 690 
Inverse polynomial, 381-382 
Inverse transform, 726 
Inversion, 361-363 

calculus of residues, 748 
mapping, 361-363 
matrix, 184-189 
numerical, 729 
PDE, 708-709 
power series, 293-294 
square pulse, 696 

Inversion of power series, 293-294 
Irreducible, 235 
group representations, 234-236 
Irregular/regular singularities, 448-450 
Irregular singular point, 440 
Irregular solution, 460 
Irrotational, 50, 51 
Isolated singular point, 372 
Isomorphism, 234 
Isoperimetric, 860 
Isospin, 229 
Isotropic, 146 

J 

Jacobi, Carl Gustav Jacob, 120 
Jacobi-Anger expansion, 609 
Jacobi identity, 189 

Jacobi identity for vector products, 34n 
Jacobi technique, 223 
Jacobian, 119, 742, 800 
Jacobians for polar coordinates, 119-120 
Jensen’s theorem, 401, 402 
Joining conditions, 489 
Jordan’s lemma, 383 
Julia set, 877 

K 

Kaon decay, 253 
Kepler’s first law, 105 

Kepler’s law in cylindrical coordinates, 104 
Kepler’s orbit equation in polar coordinates, 
106 

Kirchhoff diffraction theory, 345 
Kirchhoff’s laws, 159, 331, 421, 735 
Klein-Gordon equation, 757 
Koch curve, 876 
Kronecker, Leopold, 141 
Kronecker delta, 140, 141, 146, 180 
Kronecker product, 182 
Kronig-Kramers optical dispersion relations, 
751n 

L 

l’Hopital’s rule, 288, 546, 612 
Ladder operator, 245 
Ladder operator approach, 244-247 
Lagrange, Joseph Louis comte de, 40 
Lagrange multiplier, 212 
Lagrange multiplier method, 39-40 
Lagrange undetermined multiplier, 849 
Lagrangian, 841 

Lagrangian equations of motion, 841-843, 
853 

Lagrangian multipliers, 848-852 
Laguerre, Edmond Nicolas, 655 
Laguerre differential equation, 462 
Laguerre functions, 657 
Laguerre ODE, 452 
Laguerre polynomials, 508, 650-661 
associated, 486, 492, 508, 655-658 
generating function, 652 
hydrogen atom, 657-658 
Laguerre’s ODE, 650-652 
lowest polynomials, 653 
normalization, 509, 654 
orthogonal functions, 486, 492, 508 
orthogonality, 654-655 
recursion relation, 653-654 
Rodrigues’s formula, 653 
special integrals, 655 
Lane-Emden equation of astrophysics, 469 

Langevin theory of paramagnetism, 294 
Laplace, Pierre Simon, 54 
Laplace convolution, 742-746 


Laplace equation, 84, 571, 572, 680, 756, 759, 
846 

Laplace equation of electrostatics, 54 
Laplace PDE, 846 
Laplace series, 588 
Laplace transform, 693, 724-752 
Bessel’s function, 421 
Bromwich integral, 746-752 
convolution theorem, 742-746 
defined, 724 

derivative of a transform, 737-738 
derivatives, of, 730-734 
Dirac delta function, 732 
Euler integral, 693 
integration of transforms, 738 
inverse, 746-754 
inverse transform, 726 
numerical inversion, 729 
operational theorems, 752 
partial fraction expansion, 726-727 
RLC analog, 735-736 
substitution, 734 
translation, 736 
Laplacian, 54-55 

Laplacian development by minors, 164 

Laurent, Pierre-Alphonse, 357 

Laurent expansion, 350-359 

Laurent expansion by integrals, 356-357 

Laurent expansion by series, 357 

Laurent series, 354-358, 677 

Law 

area law for planetary motion, 104-107 

Bayes’s decomposition, 796 

Biot-Savart, 35 

conservation, 229 

cosines, of, 19, 558 

decay, 412 

Faraday’s, 74 

Faraday’s induction, 56, 74 
Gauss’s, 82-84 
Hubble’s, 11 
Kepler’s, 104, 105 
Kirchhoff’s, 159 
Malthus’s, 879 
Oersted’s, 57, 74 
Snell’s law of refraction, 836 
Stefan-Boltzmann, 539 
triangle law of vector addition, 2 
Law of cosines, 19, 558 
Law of sines, 28 
Laws of large numbers 

binomial distribution, 797-799, 802-806 
continuous Gauss distribution, 810 
Gauss distribution, 809 
Poisson distribution, 806 
Legendre, Adrien Marie, 558 
Legendre differential equation, 461 
self-adjoint form, 565 
singularities, 441, 452 
Legendre duplication formula, 528, 625 

Legendre equation, 451, 487, 565 
self-adjoint form, 565 
Legendre ODE, 565, 568 
Legendre polynomials, 276, 534, 552-588 
associated, 486, 492, 581-586 
differential equations, 564-565 
electric multipoles, 558-559 
electrostatic potential of ring of charge, 
573-574 

generating function, 553-555 
Gram-Schmidt orthogonalization, 506-507 

Legendre series, 569-570 
Legendre’s ODE, 565, 568 
linear electric multipoles, 558-559 
lowest, 557, 579 

orthogonal functions, 486, 492, 508 
orthogonality, 568-579 
parity, 556 

physical basis (electrostatics), 552-553 
polarization of dielectric, 577 
power series, 556 
recurrence relations, 563-564 
Rodrigues’s formula, 579 
shifted, 486, 492, 508, 509 
sphere in uniform field, 570-573 
spherical harmonics, 584-586 
table of functions, 554 
upper/lower bounds for P_n, 566 
uses, 552 

vector expansion, 559-560 
Legendre series, 269, 569-570 
Leibniz, Gottfried Wilhelm von, 172, 318 
Leibniz criterion, 270 
Leibniz differentiation formula, 582 
Leibniz formula for derivative of an integral, 
461 
Lemma 
Jordan’s, 383 
Riemann’s, 691 
Levi-Civita symbol, 153-157 
l’Hopital’s rule, 288, 546, 612 
Lie, Sophus, 233 
Lie algebra, 237 
Lie groups, 230 
Limit cycle, 892 
Limit tests, 267 
Line integral, 59-62 
Line integral for work, 59-61 


Linear electric multipoles, 558-559 
Linear electric octopole, 561 
Linear electric quadrupole, 559, 561 
Linear equations, 159-162, 167-168 
Linear first-order ODEs, 414-418 
Linear ODE, 410 
Linear space, 9 

Linearly independent solution, 416 

Liouville, Joseph, 501 

Liouville’s theorem, 348 

Local bifurcations, 898 

Logarithmic integral, 547 

Logistic map, 869-874 

Lorentz, Hendrik Antoon, 254 

Lorentz covariance, 248 

Lorentz-Fitzgerald contraction, 155, 254 

Lorentz force, 22 

Lorentz frame, 148 

Lorentz gauge, 57 

Lorentz group, 254 

Lorentz invariant energy squared, 252 
Lorentz invariant infinitesimal version, 251 

Lorentz invariant momentum transfer 
squared, 253 
Lorentz line shape, 745n 
Lorentz scalar, 249 

Lorentz transformations, 147, 156, 248-251 
Lorenz coupled NDEs, 902 
Lowering operator, 245, 639-642 
Lowest associated Legendre polynomials, 
583-584 

Lowest Legendre polynomials, 557, 579 
Lowest spherical harmonics, 585-586 
Lyapunov exponent, 874-875, 900 

M 

Maclaurin, Colin, 286 
Maclaurin expansion, 537 
Maclaurin expansion of the exponential, 
227 

Maclaurin integral test, 264-266 
Maclaurin series, 46, 723 
Maclaurin theorem, 283-284 
Madelung constant, 276 
Magnetic field, 48, 56-58, 71, 73-76, 81, 102, 
103-104, 111-112, 147 
Magnetic flux, 102-104 
Magnetic induction of long wire, 111-112 
Magnitude, 322 
of scalar, 1 
of vector, 5 
Malthus’s law, 879 
Mandelbrot set, 877 
Maple, 171, 450, 465, 537, 548 
Mapping, 360-368 

branch points/multivalent functions, 
363-367 

conformal, 368-370 
inversion, 361-363 
rotation, 361 
translation, 360 

Marginal probability distribution, 796 
Matching conditions, 489 
Mathcad, 171, 537, 548 
Mathematica, 171, 450, 465, 537, 548 
Mathematical manipulation computer 
software, 171 

Mathematical Methods for Physicists 
(Arfken, Weber), 50, 96 
Mathematicians. See Biographical data 
Mathematics of Computation, 622 
Matlab, 171 
Matrices, 174-228 
addition/subtraction, 179-180 
anti-Hermitian, 216 
defined, 174 

determinant of matrix product, 181-182 
diagonal, 182-183 
diagonalization, 211-228. See also 
Diagonalization of matrices 
direct product, 182 
equality, 175 

Euler angle rotation, 200-201 

Gauss-Jordan matrix inversion, 186-188 

Hermitian, 207, 209, 214-216 

Hilbert, 221 

ill-conditioned, 221 

inertia, 211-214 

inversion, 184-189 

ladder operators, 244-247 

minor, 164-165 

multiplication, 175-180 

null, 180 

orthogonal, 193-204. See also Orthogonal 
matrices 
Pauli, 208-209 
product theorem, 180-182 
rank, 175 

representation, 234-236 

similarity transformation, 202-204 

singular, 224 

symmetry, 146-147, 202 

trace, 184 

transpose, 202, 204 

transposition, 178 

unit, 180 

unitary, 207, 209 

vector transformation, 136-144, 176, 184, 
188, 193-204, 207 


Matrix addition, 179-180 
Matrix inversion, 184-189 
Matrix multiplication, 175-180 
Maximum surface tension, 852 
Maxwell-Boltzmann (MB) statistics, 788 
Maxwell’s equations, 56-57 
Lorentz covariance of, 248 
partial differential equations, 48, 55-57, 
73-74, 82-85 

McMahon’s expansion, 908 
Mean value, 791 
Mean value theorem, 282, 347n 
Mecanique Analytique (Lagrange), 40 
Mellin transform, 729 
Meromorphic function, 372, 390 
Method of least squares, 792 
Method of steepest descents, 400-409 
Methods of Mathematical Physics 
(Courant/Hilbert), 758 
Metric, 115 

Minkowski space, 121, 144, 249 
Minkowski space-time, 251-254 
Mittag-Leffler theorem, 390-392 
Mixed tensors, 144 

Modified spherical Bessel functions, 778 
Modulus, 322 

Moment generating function, 794 
Moment of inertia, 63-64 
Moment of inertia ellipsoid, 213 
Moment of inertia matrix, 211-212 
Momentum representation, 718-721 
Momentum wave equation, 721 
Monopole, 559 
Morera’s theorem, 346-347 
Mott cross section, 701 
Movable singularity, 881-882 
Moving particle, 842-843 
Multiplet, 235 
Multiplication 

complex number, 321 
complex variable, by, 324-325 
matrix, 175-180 
scalar, 9, 10 
series, 275-276 

vector. See Vector multiplication 
Multipole expansion, 560 
Multivalent functions, 363-367 
Multivalued function, 326, 374 
Mutually exclusive, 782, 783 

N 

N-fold degenerate, 501 
Navier-Stokes equation, 55 
NDEs. See Nonlinear differential equations 
(NDEs) 
Negative of a vector, 9 
Neumann, Karl, 613 

Neumann boundary conditions, 634, 758, 759 
Neumann functions, 611-617. See also 
Spherical Neumann functions 
Y_0(x), 460, 612 
Y_1(x), 612 
Y_2(x), 612 
Y_ν(x), 612, 616 
Neumann problem, 767 
Neumann series solution (quantum 
mechanical scattering), 771-774 
Neutrino energy density (Fermi distribution), 
539 

Neutron transport theory, 295 
Newton’s equation for a single particle, 252 
Newton’s equations of motion, 20, 117, 842 
Newton’s formula for finding a root, 287 
Newton’s law of universal gravitation, 84 
Newton’s method, 905 
Newton’s second law, 731, 733 
Node. See sink, attractor, 870 
Noether’s theorem, 229 
Nonlinear differential equations (NDEs), 
878-903 

autonomous differential equations, 882 
Bernoulli equations, 879-880 
bifurcations in dynamic systems, 898-899 
chaos in dynamic systems, 900-902 
dissipation in dynamic systems, 896-898 
fixed singularities, 881-882 
local/global behavior in higher dimensions, 
885-896 

Poincare section, 878 
Riccati equations, 880-881 
routes to chaos in dynamic systems, 
901-902 

special solution, 882 
Verhulst’s NDE, 882, 883 
Nonperiodic function, 694 
Nonuniform convergence, 277-278 
Normal distribution, 807-812 
Normal modes of vibration, 218-220 
Normalization, 600 
Bessel functions, 600 
Gram-Schmidt orthogonalization, 503-506 
Hermite polynomials, 509 
hydrogen momentum wave function, 723 
Laguerre polynomials, 509, 654 
momentum representation, 719 
orthogonal polynomials, 508 
Normalized hydrogen wave function, 658 
Normalized simple harmonic oscillator wave 
function, 649 

Nuclear stripping reactions, 636 


Null matrix, 179 
Null vector, 9 
Number operator, 641 
Numerical Recipes, 622 
Numerical solutions (ODE), 467-470 

O 

O(n), 232, 233 

Odd permutation, 155n 

Odd symmetry, 445 

ODE, 410. See also Differential equations 
ODE with constant coefficients, 428-429 
Oersted’s law, 57, 74, 103, 104, 111 
Ohm’s law, 72, 73, 159 
Olbers’s paradox, 268 
One-dimensional, time-independent 
Schrodinger equation, 723. See also 
Schrodinger (wave) equation 
One-sided Laplace transform, 724n 
Operator 
adjoint, 484 

curl, 47-51, 110-112, 124-125, 129 
divergence, 44-47, 108-109, 122-123, 129 
gradient, 41-43, 107-108, 121-122, 129 
Hermitian. See Hermitian operators 
lowering, 639-642 
number, 641 
projection, 505 
raising, 639-642 
self-adjoint, 484 
Optical dispersion, 750 

Optical path near event horizon of black hole, 
831-832 

Orbital angular momentum, 243-248 
ladder operator approach, 244-247 
rotation of functions, 239-240 
Orbital angular momentum equation, 586 
Orbital angular momentum in cylindrical 
coordinates, 118-119 
Order (determinants), 163 
Order of a Lie group, 237 
Order of matrix, 175 

Ordinary differential equation (ODE), 410. See 
also Differential equations 
Ordinary point, 439, 440 
Orthogonal azimuthal functions, 587 
Orthogonal coordinates, 113-121 
Orthogonal eigenfunctions, 498-500 
Orthogonal functions, 482-522 
eigenvalue/eigenvector. See 
Eigenvalue/eigenvector 
Gram-Schmidt orthogonalization, 503-510 
Hermitian operators. See Hermitian 
operators 

Legendre polynomials, 506-507 
orthogonal polynomials, 507-508 
self-adjoint ODEs, 483-496 
Orthogonal matrices, 193-204 
direction cosines, 194-195 
Euler angles, 200-201 
similarity transformation, 204 
symmetry, 202 
tensors, and, 204 
two-dimensional case, 198-200 
vectors, and, 195-198 
Orthogonal polynomials, 507-508 
Orthogonality, 498 
Bessel functions, 599-600 
Fourier series, 499, 664 
Hermite polynomials, 646-647 
Laguerre polynomials, 654-655 
Legendre polynomials, 568-579 
vectors, 483 

Orthogonality integral, 584 

Orthonormal, 503 

Oscillatory series, 261 

Overlap integral, 485 

Overshoot (Gibbs phenomenon), 666 

P 

Parabolic PDE, 470 
Parachutist, 412-413, 896 
Parallelogram addition, 3 
Parallelogram of forces, 6-7 
Parallelepiped, 30-31 
Parity 

associated Legendre polynomials, 584 
even, 446 

Fourier sine/cosine transform, 703 
Hermite polynomials, 644-645 
Legendre polynomials, 556 
spherical harmonics, 586 
Parseval’s identity, 517, 521, 686-687 
Parseval’s relation, 715-716 
Partial derivatives, 35, 470 
of a plane, 36-37 

Partial differential equations (PDEs), 470, 
756-781 

alternate solutions, 763-765 
boundary conditions, 758-760 
diffusion PDE, 760-761 
examples of PDEs in physics, 756-757 
first-order PDEs, 758 
heat flow PDE, 708, 760-761 
Helmholtz’s PDE, 763, 776 
inhomogeneous PDE, 769-776. See also 
Green’s function 
inversion of PDE, 708-709 
Poisson’s PDE, 709 


solving second-order PDEs, 757-758 
terminology, 470 

Partial fraction expansion, 726-727 
Partial sum, 257 
Particle in a box, 850 
Particle spin, 147n 
Particular solution, 460 

Particular solution of homogeneous ODE, 416 
Pascal, Blaise, 825 
Passive transformation, 202 
Path-dependent work, 59 
Pauli, Wolfgang, 209 
Pauli exclusion principle, 859 
Pauli spin matrices, 190, 208-209 
Pauli theory of the electron, 58 
PDEs. See Partial differential equations 
(PDEs) 

Pendulum 
damped, 892 
periodically driven, 902 
simple, 296-297, 853-854 
Period doubling, 901-902 
Period of cycle, 871 
Periodically driven pendulum, 902 
Permutations, 155n, 787-788 
Phase, 323 
Phase shift, 630 
Phase space, 91, 868 
Phase transition, 875 
Physicists. See Biographical data 
Piecewise regular, 664 
π mesons, 147 
Pi meson (pion), 147 
Pion photoproduction threshold, 253 
Pitchfork bifurcation, 871, 899 
Planck’s black-body radiation law, 311 
Planck’s theory of quantized oscillators, 289 
Plane circular membrane, vibrations, 600-604 
Pochhammer symbol, 538 
Poincare, Jules Henri, 867, 869 
Poincare-Bendixson theorem, 896 
Poincare group, 248 
Poincare section, 878, 879, 900 
Poisson, Simeon Denis, 85 
Poisson distribution, 804-806, 810 
Poisson’s equation, 84-85, 756 
Poisson’s PDE, 709 
Pole, 373 

Pole expansion of cotangent, 391-392 
Pole expansion of meromorphic functions, 
390-391 

Polygamma function, 536 
Polynomials 

associated Legendre, 486, 492, 581-586 
associated Laguerre, 486, 492, 508, 655-658 
Hermite. See Hermite polynomials 
Laguerre. See Laguerre polynomials 
Legendre. See Legendre polynomials 
orthogonal, 507-508 
shifted Legendre, 486, 492, 508, 509 
Potential, 41 
centrifugal, 79-80 
of conservative force, 76-80 
Coulomb, 45 
gravitational, 61-62 
scalar, 76-78 
vector, 55 

Potential theory, 76-81 
Power series, 291-294 
continuity, 292 
convergence, 291 
differentiation/integration, 292 
inversion of, 293-294 
Legendre polynomials, 556 
matrix, of, 221-222 
uniqueness theorem, 292-293 
Power series expansions 

differential equations (Frobenius’s method), 
441-454 
erf z, 548 

exponential integral, 547-548 
Power spectrum, 670 
Predictor-corrector methods, 467-468 
Prime number theorem, 273 
Principal axes, 212-216 
Principal value, 326 
Probability, 782-825 
basic principles, 783 
binomial distribution, 802-804, 810 
central moments, 794 
Chebyshev inequality, 793-794 
coin tossing, 783 
conditional, 785 
correlation, 795 
covariance, 795 
Gauss distribution, 807-812 
marginal, 796 

method of least squares, 792 
normal distribution, 807-812 
permutations and combinations, 787-788 
Poisson distribution, 804-806, 810 
random variables, 789 
repeated draws of cards, 796-799 
repeated tosses of dice, 802 
simple properties, 782-786 
standard deviation, 792 
statistics, 812-825. See also Statistics 
sum, product, ratio of random variables, 
800-801 
variance, 793 


Probability amplitudes, 521 
Product convergence theorem, 275 
Product expansion of entire functions, 
392-394 

Product theorem for determinants, 180-182 
Projection operator, 505, 522 
Proton charge form factor, 701-703 
Pseudotensor, 153 
Pseudovector, 142 
Pythagorean theorem 
area law, 105 
Cartesian coordinates, 97 
scalar product, 19 
vector, 5 

Q 

Quadrupole, 560 
Quadrupole moment tensor, 587 
Quadrupole tensor, 147 
Quantization of angular momentum, 452 
Quantum chromodynamics, 703 
Quantum mechanical angular momentum 
operators, 587 

Quantum mechanical particle in a sphere, 628 
Quantum mechanical scattering, 387-390, 
771-776 

Quantum mechanical simple harmonic 
oscillator. See Simple harmonic oscillator 
Quantum theory of atomic collisions, 397 
Quark counting, 703 
Quasiperiodic motion, 893 
Quasiperiodic route to chaos, 902 
Quaternions, 139n 
Quotient rule, 151-153 

R 

Radial ODE, 487 

Radial Schrodinger wave equation, 463 

Radial wave functions, 486, 630 

Radioactive decay, 412 

Raising operator, 245, 639-642 

Rayleigh expansion, 636 

Random event, 782 

Rank (matrix), 175 

Rapidity, 251 

Ratio test, 262-263 

Rational functions, 390-391 

Rayleigh, John William Strutt, Lord, 861 

Rayleigh equation, 580, 778 

Rayleigh formulas, 629 

Rayleigh plane wave expansion, 574 

Rayleigh-Ritz variational technique, 861-865 

Rayleigh’s theorem, 715n 

Real eigenvalues, 496-497 

Real part, 320 
Real zeros of a function, 905-909 
Reciprocal cosine, 380-381 
Rectangular Cartesian coordinates, 97-98 
Recurrence relation 

Hermite polynomials, 643 
Laguerre polynomials, 653-654 
Legendre polynomials, 563-564 
Neumann function, 613 
spherical Bessel function, 629 
two-term, 443 
Recursion formula, 564 
Reduce, 171, 450, 465, 537, 548 
Regression coefficient, 815 
Regular function, 335n 
Regular/irregular singularities, 448-450 
Regular singular point, 440 
Regular solution, 460 
Relativistic particle, Lagrangian, 844 
Relaxation methods, 223 
Repeated draws of cards, 796-799 
Repelling spiral saddle point, 894-896 
Repellor, 870, 883, 884 
Residue theorem, 378-379, 394 
Resistance-inductance circuit, 418 
Resonance condition, 745 
Resonant cavity, 604-608 
Riccati equations, 880 
Riccati NDE, 880 
Richardson’s extrapolation, 467 
Riemann, Bernhard Georg Friedrich, 335 
Riemann integral, 59, 498n 
Riemann lemma, 691 
Riemann surface, 366, 367 
Riemann theorem, 272 
Riemann zeta function 
Bernoulli numbers, 305 
comparison test, 262-263 
convergence, 308-309 
Maclaurin integral test, 266-267 
polygamma function, 536 
table of values, 313 
Riemannian, 115 
Riemann’s lemma, 691 
Riemann’s theorem, 272 
Right-hand rule for positive normal, 63 
Ring, 180 
RL circuit, 418 
RLC analog, 735-736 
RLC circuit, 432, 735 
Rocket 

shortest distance between two rockets in 
free flight, 24-25 

shortest distance of observer from, 840-841 
shortest distance of rocket from observer, 
17-20 


Rodrigues’s formula/representation, 579 
associated Laguerre polynomial, 656 
Hermite polynomials, 645 
Laguerre polynomials, 653 
Root test, 262 
Rosetta curves, 107 
Rossler coupled ODEs, 902 
Rotation, 361 

Rotation ellipsoid, 131-133 
Rotation groups SO(2) and SO(3), 238-239 
Rotation of functions/orbital angular 
momentum, 239-240 
Rotation-soap film problem, 832-834 
Rouche’s theorem, 393, 394 
Row vector, 178 
Rule 

addition, 784 
BAC-CAB, 32 
chain. See Chain rule 
Cramer’s, 162 
l’Hopital’s, 288 
quotient, 151-152 

right-hand rule for positive normal, 63 
Runge-Kutta method, 466-467 
Runge-Kutta-Nystrom method, 468 

S 

S-wave particle in an infinite spherical square 
well, 863-864 

Saddle point, 336, 401, 888-889, 898 
Saddle point axis, 404 
Saddle point in one dimension, 884 
Saddle point method, 402-407 
Sample, 783 
SAT tests, 785-786 
Sawtooth wave, 665-666 
Scalar Kirchhoff diffraction theory, 779 
Scalar multiplication, 9, 10 
Scalar potential, 76-78 
Scalar product of vectors, 12-20 
Scalar or inner product of functions, 483, 485, 
498, 516 

Scalar quantities, 1 
Scale factors, 116 

Scattering theory of relativistic particles, 91 
Schlaefli integral, 618 
Schrodinger equation potential, 634 
Schrodinger (wave) equation, 206 
central potential, 865 
constrained minimum, 856-857 
deuteron, 488 
eigenvalue problems, 223 
hydrogen atom problem, 478 
matrix representation, 234 
momentum representation, 718 
one-dimensional, time-independent 
equation, 723 
PDEs, 757 

quantum mechanical scattering, 771 
single particle system, 485 
Schwarz inequality, 337, 513-514 
Schwarz reflection principle, 351-352, 538 

Schwarz’s theorem, 358 
Second-order ODEs, 424-439 
missing variable x, 426 
missing variable y, 425-426 
Second-order PDEs, 470, 757-758 
Second-rank tensor, 144-145 
Second solution (ODE), 454-464 
Secular equation, 214 
Self-adjoint, 207, 484 
Self-adjoint ODEs, 483-496 
Self-similar sets, 875 

Separation of variables for ODEs and PDEs, 
411-412, 470-480 
Cartesian coordinates, 471-473 
circular cylindrical coordinates, 474-476 
first-order ODE, 411-412 
spherical polar coordinates, 476-478 
Series 

addition/subtraction, 260-261 
algebra of, 274-276 
alternating, 269-274 
asymptotic, 314-316 
Bessel, 600 

Fourier. See Fourier series 
geometric, 258-259 
harmonic. See Harmonic series 
infinite. See Infinite series 
Laurent, 354-358 
Legendre, 269, 569-570 
oscillatory, 261 
power, 291-294 
Stirling’s, 541-542 
Taylor. See Taylor expansion 
Series solutions (Frobenius’s method), 
441-454 

Series summation, 537 
Servomechanism, 745 

Shifted Legendre polynomials, 486, 492, 508, 
509, 729 

SHO equation, 451 

Shortest distance of observer from rocket, 
840-841 

Similarity transformation, 204, 231-233 
Simple harmonic oscillator 

driven harmonic oscillator, 707-708 
Hermite polynomials, 638-639 
momentum representation, 720-721 


normalized wave function, 649 
orthogonal functions, 486, 492 
3-D, 661 

Simple harmonic oscillator (SHO) equation, 
451 

Simple pendulum, 296-297, 853-854 
Simple pole, 373 
Simple unitary groups, 233 
Sine integral, 547 
product representation, 392 
Sine transform, 700 

Single harmonic oscillator, classical harmonic 
oscillator, 731-732 
Single well, 899 
Singular, 224 
Singular points, 439-441 
Singular solution, 419 
Singularities, 372-377 
Bessel, 441 

branch points, 374, 377 
essential, 373, 440 
fixed, 881-882 
movable, 881-882 
poles, 373-374 
regular/irregular, 448-450 
Singularity on contour of integration, 386-387 
Sink, 870, 883, 884 
Skewsymmetric matrix, 202 
SL(2), 236 

Sliding off a log, 854-856 
Snell’s law of refraction, 836 
SO(2), 231, 238-239 
SO(3), 233, 238-239 
SO(n), 232, 233 
Soap film, 832-834 
Solenoidal, 51 
Source, 414 

Special linear group SL(2), 236 
Special relativity, 249 
Special unitary group SU(2), 240-242 
Spherical Bessel functions, 624-637 
definitions, 624-627 
limiting values, 627-628 
numerical computation, 630 
recurrence relations, 629 
spherical Hankel functions, 624, 627 
spherical Neumann functions, 624, 626 
Spherical Green’s function, 778 
Spherical Hankel functions, 624, 627 
Spherical harmonic closure relation, 587 
Spherical harmonics, 584-586 
Spherical Neumann functions, 624, 626 
Spherical polar coordinates, 126-136, 760 
Spherical symmetry, 225 
Spherically symmetric heat flow, 766-767 
Spin 1/2 particle, 242 
Spin states, 247 
Spin 1/2 wave functions, 240, 241 
Spin 1 particles, 147 
Spin 2 particles, 147 
Spin zero particles, 147 
Spinor wave functions, 209 
Spinors, 147 

Spiral fixed point, 890-891 
Spiral repellor, 889, 898 
Spiral saddle point, 894 
Spiral sink, 889, 898 
Spontaneous or movable singularity, 881 

Square pulse, 691-692, 696 
Square wave, 500, 511 
Square wave-high frequencies, 668-669 

Stable sink, 886-888 
Standard deviation, 792 
Standing spherical waves, 627 
Stark effect, 453, 661 
Statistical hypothesis, 812 
Statistical mechanics, 787-788 
Statistics, 812-825. See also Probability 
Chi squared distribution, 817-821 
confidence interval, 823-824 
error propagation, 812-815 
fitting curves to data, 815-817 
student t distribution, 821-823 
Steady-state current, 433 
Steepest descent, method of, 400-409 
Stefan-Boltzmann law, 539 
Step function, 727-728 
Stieltjes integral, 89n 

Stirling’s asymptotic formula, 268, 274, 810 
Stirling’s expansion of the factorial function, 
407 

Stirling’s formula, 316, 408, 542, 543, 806, 811, 
859 

Stirling’s series, 541-542, 543, 544 
Stokes, Sir George Gabriel, 75 
Stokes’s theorem, 72-73 
Bessel functions, 617, 623 
Cauchy’s integral theorem, 339-341 
curl (cylindrical coordinates), 110, 124 
magnetic flux, 103-104 
Straight line, 830-831 
Strange attractor, 877 
Structurally stable fixed points, 882 
Structurally unstable fixed points, 882 
Structure constants of a Lie group, 237 
Structure of the Nucleus (Preston), 331 
Student t distribution, 821-823 
Sturm, Jacques Charles, 501 


Sturm-Liouville equation, 861 
Sturm-Liouville theory, 485, 569. See also 
Orthogonal functions 
SU(2), 233, 240-242 
SU(n), 233 
Subgroup, 231 

Substitution (Laplace transform), 734 
Subtraction 

complex number, 321 
matrix, 179-180 
series, 260-261 
tensors, 145 
vector, 3 

Subtraction of sets, 784 
Sufficiency conditions, 829n 
Summation convention (tensor analysis), 
145-146 

Summing series, 537 
Superposition principle, 411 
Euler’s ODE, 427-428, 429 
homogeneous PDEs, 470 
linear electric multipoles, 558 
ODEs with constant coefficients, 428-429 

separation of variables, 473 
vibrating membranes, 603 
Surface integrals, 62-65 
Surface of Hemisphere, 130 
Symmetric matrix, 202 
Symmetric tensor, 147 
Symmetry 
azimuthal, 571, 586 
ODEs, 445-446 
periodic functions, 671 
tensors, 146-147 

T 

Table of the Gamma Function for Complex 
Arguments, 543 

Tables of Functions (Jahnke/Emde), 476 
Taking the complex conjugate, 321 
Tangent bifurcation, 903 
Taylor, Brooke, 282-283 
Taylor expansion, 136-157, 350-351, 672 
Taylor series solution, 464-465 
Tensor analysis, 136-157 

addition/subtraction of tensors, 145 
contraction, 149 
contravariant tensor, 144-145 
covariance of cross product, 142 
covariance of gradient, 143 
covariant tensor, 144-145 
direct product, 149-151 
dual tensors, 153-157 
Levi-Civita symbol, 153-156 
pseudotensor, 153 

quotient rule, 151-153 

rotation of coordinate axes, 137-142 

spinors, 147 

summation convention, 145-146 
symmetry-antisymmetry, 146-147 
tensors of rank two, 144-145 
Tensor of rank zero, 136 
Tensor of rank one, 136 
Tensor of rank two, 144-145 
Tensor of rank n, 136 
Tensor transformation law, 115n 
Term-by-term differentiation, 280 
Theorem 
Abel’s, 678 
addition, 594 
binomial, 284-286 
Cauchy’s integral, 337-343 
center manifold, 898 
Fuchs’s, 450 
Gauss’s, 68 
Green’s, 70 
Jensen’s, 401, 402 
Liouville’s, 348 
Maclaurin, 283-284 
mean value, 282 
Morera’s, 346-347 
Poincare-Bendixson, 896 
prime number, 273 
product, 180-182 
residue, 378-379 
Riemann’s, 272 
Rouche’s, 393 
Schwarz’s, 358 
Stokes’s, 72-73 
uniqueness, 292-293 
Theory of anomalous dispersion, 751n 
Theory of free radial oscillations, 907 
Three-dimensional harmonic oscillator, 661 
Time-dependent scalar potential equation, 
757 

Time-dependent Schrodinger equation, 206. 
See also Schrodinger (wave) equation 
Time-independent diffusion equations, 756 
Time-independent Schrodinger equation, 234, 
723, 771 

Total charge inside a sphere, 90 
Total variation of a function, 37 
Trace, 184 
Trace formula, 222 

Trajectory. See also orbit, 14, 17-18, 105-106, 
868, 878, 882-896 
Transfer function, 745 
Transformation theory, 136 
Transformations, 202-204 


Translation, 360, 736 
Translation invariance, 4n 
Transpose, 178 
Transposed matrix, 202, 204 
Transposition, 178 
Transversality condition, 839-841 
Traveling spherical waves, 627 
Treatise on Algebra (Maclaurin), 162 
Triangle inequalities, 324 
Triangle law of vector addition, 2 
Trigonometric Fourier series, 677 
Triple scalar product, 29-31 
Triple vector product, 31-33 
Two-index Levi-Civita symbol, 157 
Two-sided Laplace transform, 724n 

U 

²³⁸U, 807 
U(1), 233 
U(n), 233 

Undershoot (Gibbs phenomenon), 666 
Uniform convergence, 276-277 
Union of sets, 784 
Uniqueness theorem, 292-293 
Unit matrix, 180 
Unit step function, 395, 687, 738-739 
Unit vector, 5 
Unitary groups, 233 
Unitary matrix, 207-209 
Universal Feigenbaum numbers, 873, 901 
Universal number, 875 

V 

Van der Pol nonautonomous system, 893 
Variance, 793 
Variation of the constant, 416 
Variation with constraints, 850-852 
Variational principles. See Calculus of variations 

Vector. See also Vector analysis 
contravariant, 143 
covariant, 143 
defined, 1 
differentiation, 17 
direct product, 150-151 
geometric representation, 1 
negative of, 9 
null, 9 
orthogonal matrices, 195-198 
physics, and, 5 
row, 178 
unit, 5 
uses, 1 

Vector addition, 2-3 
Vector analysis, 1-158. See also Vector 
BAC-CAB rule, 33 
cross product, 20-24 
curl, 47-51 
curved coordinates. See Curved coordinates 
Dirac delta function, 86 
divergence, 44-47 
elastic forces, 8-9 
electromagnetic wave equations, 56-57 
elementary approach, 1-12 
free particle motion, 14-17 
Gauss’s law, 82-84 
Gauss’s theorem, 68-69 
gradient, 35-44. See also Gradient 
Green’s theorem, 70 

medians of triangle meet in center, 25-26 
Minkowski space-time, 251-253 
parallelogram of forces, 6-7 
Poisson’s equation, 84-85 
potential theory, 76-81 
scalar potential, 76-78 
scalar product, 12-20 
shortest distance between two rockets in free flight, 24-25 
shortest distance of rocket from observer, 17-20 
Stokes’s theorem, 72-73 
successive applications of the gradient operator, 53-58 

tensor analysis. See Tensor analysis 
triple scalar product, 29-31 
triple vector product, 31-33 
vector addition, 2-3 
vector integration, 58-68 
vector potential of constant B field, 48 
vector subtraction, 3 
Vector equality, 9 
Vector field, 5 
Vector integration, 58-68 
Vector multiplication 
cross product, 20-24 
direct product, 149-151 
scalar product, 12-20 
Vector potential, 55 


Vector product, 20-24 
Vector space, 9 
Vector subtraction, 3 
Velocity of electromagnetic waves, 749-752 

Verhulst, P. F., 869 
Verhulst’s NDE, 882, 883, 900 
Vibrating membranes, 600-604 
Vibrating string, 863 
Volterra integral equations, 743 
Volume 
ellipsoid, 131 
rotated Gaussian, 66 
Volume integrals, 65-66 
von Staudt-Clausen theorem, 304 
Vorticity, 50n 

W 

Wave equation, 709-711 
Weierstrass, Karl Theodor Wilhelm, 528 
Weierstrass infinite product definition, 526-528 

Weierstrass M test, 278-279 
Weight function, 485-487 
Wigner, Eugen Paul, 233 
WKB expansion, 314 
Work integral, 103 
Wronski, Jozef Maria, 436 
Wronskian 
Bessel functions, 613-614 
linear first-order ODE, 417 
second-order ODEs, 435, 436 
second solution, ODE, 454-456 
uses, 614 

Y 

Yukawa potential, 778 

Z 

Zeeman effect, 216n, 235 
Zero-point energy, 639 
Zeros, of functions, 392, 905-908 
of Bessel functions, 598-599 
Zeta function. See Riemann zeta function 